Detection of lung neoplasia by amplification of RNA sequences

ABSTRACT

Provided herein is technology relating to lung neoplasia screening and particularly, but not exclusively, to methods, compositions, and related uses for detecting the presence of lung cancer.

CROSS REFERENCE TO RELATED APPLICATIONS

The present invention claims priority benefit of U.S. Provisional Patent Application No. 62/332,419, filed May 5, 2016, which is incorporated by reference in its entirety.

FIELD OF THE INVENTION

Provided herein is technology relating to detecting neoplasia and particularly, but not exclusively, to methods, compositions, and related uses for detecting neoplasms such as lung cancer.

BACKGROUND OF THE INVENTION

Lung cancer is the most frequent cause of death among men and women younger than 85 years in the US. It accounts for 27% of all cancer deaths and 221,000 lost lives annually. This mortality rate exceeds that of the next 4 highest ranking cancers combined. Gene expression profiling has confirmed unique mRNA expression in cancers and can be used as an approach for detection of lung malignancies. An mRNA multi-marker approach to detect all subtypes of lung cancer needs to be explored. This study assesses the value of measuring expression levels of multiple mRNA markers in detecting lung cancer of different subtypes.

SUMMARY OF THE INVENTION

This technology is in the field of nucleic acid detection and quantification. Specifically, the technology addresses the detection and quantification of RNA in samples using a single-tube RT-PCR-invasive cleavage reaction (RT-QuARTS).

In some embodiments the technology provides methods of screening for a lung neoplasm in a sample obtained from a subject, the methods comprising, e.g., a) assaying a sample from a subject for an amount of at least one RNA marker selected from the group consisting of GAGE12D, FAM83A, LRG1, XAGE-1d, MAGEA4, SFTPB, AKAP4, and CYP24A1; b) assaying the sample for an amount of a reference marker in the sample; c) comparing the amount of the at least one RNA marker to the amount of the reference marker to determine a level of expression for the at least one marker gene in the sample; and d) generating a record reporting the expression for the at least one marker gene in said sample. In some embodiments the method comprises obtaining a sample comprising RNA from a subject and treating the RNA with a reverse transcriptase, preferably MMLV reverse transcriptase, to form a cDNA copy of at least a portion of the RNA. In preferred embodiments, the cDNA is created and detected in a single vessel, without opening the vessel, e.g., to add additional reagents.
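Steps c) and d) of the method above may be illustrated with a short sketch. The specification does not fix the arithmetic of the comparison; the sketch below assumes, purely for illustration, that the level of expression is the ratio of the marker amount to the reference amount. The names `ExpressionRecord`, `relative_expression`, and `make_record` are hypothetical and are not part of the described technology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressionRecord:
    """Hypothetical record reporting the expression of one marker in one sample."""
    sample_id: str
    marker: str
    reference: str
    relative_expression: float


def relative_expression(marker_amount: float, reference_amount: float) -> float:
    """Compare a marker amount to a reference amount.

    Assumption: the comparison is a simple ratio. Other normalizations
    (e.g., log ratios) would fit the same step equally well.
    """
    if marker_amount < 0:
        raise ValueError("marker amount must be non-negative")
    if reference_amount <= 0:
        raise ValueError("reference amount must be positive")
    return marker_amount / reference_amount


def make_record(sample_id: str, marker: str, marker_amount: float,
                reference: str, reference_amount: float) -> ExpressionRecord:
    """Generate a record reporting the normalized expression (step d)."""
    level = relative_expression(marker_amount, reference_amount)
    return ExpressionRecord(sample_id, marker, reference, level)


# Illustrative amounts only; these are not measured data.
record = make_record("sample-1", "MAGEA4", 250.0, "CASC3", 1000.0)
```

In this sketch the reference amount is required to be positive, so that a sample in which the reference marker is not detected raises an error rather than yielding an undefined expression level.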

In some embodiments the at least one RNA marker is at least two markers. In some preferred embodiments the at least one RNA marker comprises the group consisting of GAGE, FAM83A, LRG1 and MAGEA4 markers, while in some embodiments, the at least one RNA marker comprises the group consisting of GAGE, FAM83A, LRG1, CYP24A1, XAGE1D and MAGEA4 markers. In some embodiments, the reference marker is an RNA, preferably an RNA selected from the group consisting of CASC3 mRNA, β-actin mRNA, U1 snRNA and U6 snRNA.

In some embodiments the technology comprises assaying RNA using one or more of a polymerase chain reaction, nucleic acid sequencing, mass spectrometry, mass-based separation, or target capture. In particularly preferred embodiments, the assaying comprises using a flap endonuclease assay, such as a QuARTS assay, as described hereinbelow.

In some embodiments, assaying the expression of the RNA marker comprises detecting increased or decreased expression of the RNA marker relative to a normal expression of the marker.

Samples suitable for analysis using the technology are not limited to a particular sample type. In some embodiments the sample is a tissue sample, a blood sample, a serum sample, or a sputum sample. In certain preferred embodiments the tissue sample comprises lung tissue.

The technology further provides kits, e.g., for practicing the technology. For example, in some embodiments the technology provides a kit comprising:

a) at least one oligonucleotide, wherein at least a portion of said oligonucleotide specifically hybridizes to a marker RNA selected from the group consisting of GAGE12D, FAM83A, LRG1, XAGE-1d, MAGEA4, SFTPB, AKAP4, and CYP24A1, and

b) at least one additional oligonucleotide, wherein at least a portion of said additional oligonucleotide specifically hybridizes to a reference nucleic acid.

In preferred embodiments the kit comprises at least two additional oligonucleotides. In some embodiments, the kit further comprises one or more components selected from the group consisting of reverse transcriptase, flap endonuclease, DNA polymerase, and a FRET cassette.

In some embodiments the at least one RNA marker is selected from the group consisting of GAGE, FAM83A, LRG1, CYP24A1, XAGE1D and MAGEA4, and in some embodiments the RNA marker is selected from the group consisting of GAGE, FAM83A, LRG1 and MAGEA4. In certain preferred embodiments the kit comprises at least 4 oligonucleotides, wherein each of the markers in the group consisting of GAGE, FAM83A, LRG1, and MAGEA4 specifically hybridizes to at least one of said 4 oligonucleotides. In other embodiments, the kit comprises at least 6 oligonucleotides, wherein each of the markers in the group consisting of GAGE, FAM83A, LRG1, CYP24A1, XAGE1D and MAGEA4 specifically hybridizes to at least one of said 6 oligonucleotides. In preferred embodiments at least one oligonucleotide is selected from one or more of a capture oligonucleotide, a pair of nucleic acid primers, a nucleic acid probe, and an invasive oligonucleotide.

The technology is not limited to which particular reference marker RNA is used, and many are known in the field. In preferred embodiments, the reference marker is an RNA selected from the group consisting of CASC3, β-actin, U1 and U6 RNA.

The technology further comprises compositions such as mixtures, e.g., reaction mixtures. In some embodiments the technology provides a mixture comprising a complex of at least one RNA marker selected from the group consisting of GAGE12D, FAM83A, LRG1, XAGE-1d, MAGEA4, SFTPB, AKAP4, and CYP24A1 and an oligonucleotide that specifically hybridizes to the RNA marker. In preferred embodiments, the composition further comprises a complex of at least one reference marker and an oligonucleotide that specifically hybridizes to the reference RNA marker. In some embodiments the at least one RNA marker is selected from the group consisting of GAGE, FAM83A, LRG1, CYP24A1, XAGE1D and MAGEA4, while in some embodiments the at least one RNA marker is selected from the group consisting of GAGE, FAM83A, LRG1 and MAGEA4. In preferred embodiments the composition comprises a reference marker that is an RNA selected from the group consisting of CASC3, β-actin, U1 RNA and U6 RNA. In particularly preferred embodiments, the oligonucleotide is selected from one or more of a capture oligonucleotide, a pair of nucleic acid primers, a nucleic acid probe, and an invasive oligonucleotide. Preferably the composition comprises a nucleic acid probe oligonucleotide comprising a reporter molecule, e.g., a fluorophore, and/or a flap sequence.

In some embodiments, the composition further comprises one or more components selected from the group consisting of reverse transcriptase (e.g., MMLV reverse transcriptase), flap endonuclease, thermostable DNA polymerase, and a FRET cassette. In preferred embodiments, the DNA polymerase is a bacterial DNA polymerase.

Definitions

To facilitate an understanding of the present technology, a number of terms and phrases are defined below. Additional definitions are set forth throughout the detailed description.

Throughout the specification and claims, the following terms take the meanings explicitly associated herein, unless the context clearly dictates otherwise. The phrase “in one embodiment” as used herein does not necessarily refer to the same embodiment, though it may. Furthermore, the phrase “in another embodiment” as used herein does not necessarily refer to a different embodiment, although it may. Thus, as described below, various embodiments of the invention may be readily combined, without departing from the scope or spirit of the invention.

In addition, as used herein, the term “or” is an inclusive “or” operator and is equivalent to the term “and/or” unless the context clearly dictates otherwise. The term “based on” is not exclusive and allows for being based on additional factors not described, unless the context clearly dictates otherwise. In addition, throughout the specification, the meaning of “a”, “an”, and “the” include plural references. The meaning of “in” includes “in” and “on.”

The transitional phrase “consisting essentially of” as used in claims in the present application limits the scope of a claim to the specified materials or steps “and those that do not materially affect the basic and novel characteristic(s)” of the claimed invention, as discussed in In re Herz, 537 F.2d 549, 551-52, 190 USPQ 461, 463 (CCPA 1976). For example, a composition “consisting essentially of” recited elements may contain an unrecited contaminant at a level such that, though present, the contaminant does not alter the function of the recited composition as compared to a pure composition, i.e., a composition “consisting of” the recited components.

As used herein, the “sensitivity” of a given marker (or set of markers used together) refers to the percentage of samples that report a marker value (e.g., an expression marker) above a threshold value that distinguishes between neoplastic and non-neoplastic samples. In some embodiments, a positive is defined as a histology-confirmed neoplasia that reports a marker value above a threshold value (e.g., the range associated with disease), and a false negative is defined as a histology-confirmed neoplasia that reports a marker value below the threshold value (e.g., the range associated with no disease). The value of sensitivity, therefore, reflects the probability that a measurement for a given marker obtained from a known diseased sample will be in the range of disease-associated measurements. As defined here, the clinical relevance of the calculated sensitivity value represents an estimation of the probability that a given marker would detect the presence of a clinical condition when applied to a subject with that condition.

As used herein, the “specificity” of a given marker (or set of markers used together) refers to the percentage of non-neoplastic samples that report a marker value (e.g., an expression marker) below a threshold value that distinguishes between neoplastic and non-neoplastic samples. In some embodiments, a negative is defined as a histology-confirmed non-neoplastic sample that reports a marker value below the threshold value (e.g., the range associated with no disease) and a false positive is defined as a histology-confirmed non-neoplastic sample that reports a marker value above the threshold value (e.g., the range associated with disease). The value of specificity, therefore, reflects the probability that a marker measurement for a given marker obtained from a known non-neoplastic sample will be in the range of non-disease associated measurements. As defined here, the clinical relevance of the calculated specificity value represents an estimation of the probability that a given marker would detect the absence of a clinical condition when applied to a patient without that condition.
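The two definitions above may be illustrated by counting samples on either side of a threshold. The following is a minimal sketch, assuming that each sample carries a histology-confirmed label and a single marker value. A value exactly equal to the threshold is treated here as not above it, which is an assumption of the sketch because the definitions address only values above or below the threshold. The function name `sensitivity_specificity` is hypothetical.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(values: Sequence[float],
                            is_neoplastic: Sequence[bool],
                            threshold: float) -> Tuple[float, float]:
    """Return (sensitivity, specificity) as fractions between 0 and 1.

    values        -- marker value reported for each sample
    is_neoplastic -- histology-confirmed status of each sample
    threshold     -- value separating the disease-associated range (above)
                     from the non-disease range (at or below; assumption)
    """
    if len(values) != len(is_neoplastic):
        raise ValueError("values and labels must have the same length")

    positives = false_negatives = negatives = false_positives = 0
    for value, diseased in zip(values, is_neoplastic):
        above = value > threshold
        if diseased and above:
            positives += 1            # neoplasia reported above threshold
        elif diseased:
            false_negatives += 1      # neoplasia reported below threshold
        elif above:
            false_positives += 1      # non-neoplastic reported above threshold
        else:
            negatives += 1            # non-neoplastic reported below threshold

    if positives + false_negatives == 0 or negatives + false_positives == 0:
        raise ValueError("need at least one neoplastic and one non-neoplastic sample")

    sensitivity = positives / (positives + false_negatives)
    specificity = negatives / (negatives + false_positives)
    return sensitivity, specificity


# Illustrative values only; these are not measured data.
sens, spec = sensitivity_specificity(
    values=[5.0, 7.0, 1.0, 2.0, 6.0],
    is_neoplastic=[True, True, True, False, False],
    threshold=3.0,
)
```

Multiplying either fraction by 100 gives the percentage referred to in the definitions.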

The term “primer” refers to an oligonucleotide, whether occurring naturally as, e.g., a nucleic acid fragment from a restriction digest, or produced synthetically, that is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product that is complementary to a nucleic acid template strand is induced (e.g., in the presence of nucleotides and an inducing agent such as a DNA polymerase, and at a suitable temperature and pH). The primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer, and the use of the method.

The term “probe” refers to an oligonucleotide (e.g., a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly, or by PCR amplification, that is capable of hybridizing to another oligonucleotide of interest. A probe may be single-stranded or double-stranded. Probes are useful in the detection, identification, and isolation of particular gene sequences (e.g., a “capture probe”). It is contemplated that any probe used in the present invention may, in some embodiments, be labeled with any “reporter molecule,” so that it is detectable in any detection system, including, but not limited to, enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label.

The term “target,” as used herein, refers to a nucleic acid sought to be sorted out from other nucleic acids, e.g., by probe binding, amplification, isolation, capture, etc. For example, when used in reference to the polymerase chain reaction, “target” refers to the region of nucleic acid bounded by the primers used for polymerase chain reaction, while when used in an assay in which target nucleic acid is not amplified, e.g., in some embodiments of an invasive cleavage assay, a target comprises the site at which a probe and invasive oligonucleotides (e.g., INVADER oligonucleotide) bind to form an invasive cleavage structure, such that the presence of the target nucleic acid can be detected. A “segment” is defined as a region of nucleic acid within the target sequence.

The term “marker”, as used herein, refers to a substance (e.g., a nucleic acid, or a region of a nucleic acid, or a protein) that may be used to distinguish non-normal cells (e.g., cancer cells) from normal cells, e.g., based on presence, absence, or status (e.g., post-transcriptional processing) of the marker substance.

The term “sample” is used in its broadest sense. In one sense it can refer to an animal cell or tissue. In another sense, it refers to a specimen or culture obtained from any source, as well as biological and environmental samples. Biological samples may be obtained from plants or animals (including humans) and encompass fluids, solids, tissues, and gases. Environmental samples include environmental material such as surface matter, soil, water, and industrial samples. These examples are not to be construed as limiting the sample types applicable to the present invention.

The term “neoplasm” as used herein refers to any new and abnormal growth of tissue. Thus, a neoplasm can be a premalignant neoplasm or a malignant neoplasm.

The term “neoplasm-specific marker,” as used herein, refers to any biological material or element that can be used to indicate the presence of a neoplasm. Examples of biological materials include, without limitation, nucleic acids (DNA, RNA, miRNA, etc.), polypeptides, carbohydrates, fatty acids, cellular components (e.g., cell membranes and mitochondria), and whole cells. In some instances, markers are particular nucleic acid regions (e.g., genes, intragenic regions, specific loci, etc.). Regions of nucleic acid that are markers may be referred to, e.g., as “marker genes,” “marker regions,” “marker sequences,” “marker loci,” etc.

As used herein, the terms “patient” or “subject” refer to organisms to be subject to various tests provided by the technology. The term “subject” includes animals, preferably mammals, including humans. In a preferred embodiment, the subject is a primate. In an even more preferred embodiment, the subject is a human. Further with respect to diagnostic methods, a preferred subject is a vertebrate subject. A preferred vertebrate is warm-blooded; a preferred warm-blooded vertebrate is a mammal. A preferred mammal is most preferably a human. As used herein, the term “subject” includes both human and animal subjects. Thus, veterinary therapeutic uses are provided herein. As such, the present technology provides for the diagnosis of mammals such as humans, as well as those mammals of importance due to being endangered, such as Siberian tigers; of economic importance, such as animals raised on farms for consumption by humans; and/or animals of social importance to humans, such as animals kept as pets or in zoos. Examples of such animals include but are not limited to: carnivores such as cats and dogs; swine, including pigs, hogs, and wild boars; ruminants and/or ungulates such as cattle, oxen, sheep, giraffes, deer, goats, bison, and camels; pinnipeds; and horses. Thus, also provided is the diagnosis and treatment of livestock, including, but not limited to, domesticated swine, ruminants, ungulates, horses (including race horses), and the like.

The presently-disclosed subject matter further includes a system for diagnosing a lung cancer in a subject. The system can be provided, for example, as a commercial kit that can be used to screen for a risk of lung cancer or diagnose a lung cancer in a subject from whom a biological sample has been collected. An exemplary system provided in accordance with the present technology includes assessing the expression of a marker described herein.

The term “amplifying” or “amplification” in the context of nucleic acids refers to the production of multiple copies of a polynucleotide, or a portion of the polynucleotide, typically starting from a small amount of the polynucleotide (e.g., a single polynucleotide molecule), where the amplification products or amplicons are generally detectable. Amplification of polynucleotides encompasses a variety of chemical and enzymatic processes. The generation of multiple DNA copies from one or a few copies of a target or template DNA molecule during a polymerase chain reaction (PCR) or a ligase chain reaction (LCR; see, e.g., U.S. Pat. No. 5,494,810; herein incorporated by reference in its entirety) are forms of amplification. Additional types of amplification include, but are not limited to, allele-specific PCR (see, e.g., U.S. Pat. No. 5,639,611; herein incorporated by reference in its entirety), assembly PCR (see, e.g., U.S. Pat. No. 5,965,408; herein incorporated by reference in its entirety), helicase-dependent amplification (see, e.g., U.S. Pat. No. 7,662,594; herein incorporated by reference in its entirety), hot-start PCR (see, e.g., U.S. Pat. Nos. 5,773,258 and 5,338,671; each herein incorporated by reference in their entireties), intersequence-specific PCR, inverse PCR (see, e.g., Triglia, et al. (1988) Nucleic Acids Res., 16:8186; herein incorporated by reference in its entirety), ligation-mediated PCR (see, e.g., Guilfoyle, R. et al., Nucleic Acids Research, 25:1854-1858 (1997); U.S. Pat. No. 5,508,169; each of which are herein incorporated by reference in their entireties), miniprimer PCR, multiplex ligation-dependent probe amplification (see, e.g., Schouten, et al., (2002) Nucleic Acids Research 30(12): e57; herein incorporated by reference in its entirety), multiplex PCR (see, e.g., Chamberlain, et al., (1988) Nucleic Acids Research 16(23) 11141-11156; Ballabio, et al., (1990) Human Genetics 84(6) 571-573; Hayden, et al., (2008) BMC Genetics 9:80; each of which are herein incorporated by reference in their entireties), nested PCR, overlap-extension PCR (see, e.g., Higuchi, et al., (1988) Nucleic Acids Research 16(15) 7351-7367; herein incorporated by reference in its entirety), real time PCR (see, e.g., Higuchi, et al., (1992) Biotechnology 10:413-417; Higuchi, et al., (1993) Biotechnology 11:1026-1030; each of which are herein incorporated by reference in their entireties), reverse transcription PCR (see, e.g., Bustin, S. A. (2000) J. Molecular Endocrinology 25:169-193; herein incorporated by reference in its entirety), solid phase PCR, thermal asymmetric interlaced PCR, and Touchdown PCR (see, e.g., Don, et al., Nucleic Acids Research (1991) 19(14) 4008; Roux, K. (1994) Biotechniques 16(5) 812-814; Hecker, et al., (1996) Biotechniques 20(3) 478-485; each of which are herein incorporated by reference in their entireties). Polynucleotide amplification also can be accomplished using digital PCR (see, e.g., Kalinina, et al., Nucleic Acids Research. 25; 1999-2004, (1997); Vogelstein and Kinzler, Proc Natl Acad Sci USA. 96; 9236-41, (1999); International Patent Publication No. WO05023091A2; US Patent Application Publication No. 20070202525; each of which are incorporated herein by reference in their entireties).

The term “polymerase chain reaction” (“PCR”) refers to the method of K. B. Mullis U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,965,188, that describe a method for increasing the concentration of a segment of a target sequence in a mixture of genomic or other DNA or RNA, without cloning or purification. This process for amplifying the target sequence consists of introducing a large excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase. The two primers are complementary to their respective strands of the double stranded target sequence. To effect amplification, the mixture is denatured and the primers then annealed to their complementary sequences within the target molecule. Following annealing, the primers are extended with a polymerase so as to form a new pair of complementary strands. The steps of denaturation, primer annealing, and polymerase extension can be repeated many times (i.e., denaturation, annealing and extension constitute one “cycle”; there can be numerous “cycles”) to obtain a high concentration of an amplified segment of the desired target sequence. The length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter. By virtue of the repeating aspect of the process, the method is referred to as the “polymerase chain reaction” (“PCR”). Because the desired amplified segments of the target sequence become the predominant sequences (in terms of concentration) in the mixture, they are said to be “PCR amplified” and are “PCR products” or “amplicons.” Those of skill in the art will understand the term “PCR” encompasses many variants of the originally described method using, e.g., real time PCR, nested PCR, reverse transcription PCR (RT-PCR), single primer and arbitrarily primed PCR, etc.
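Two points in the description of PCR above lend themselves to simple arithmetic: repeated cycles raise the concentration of the amplified segment, and the primer positions set the segment length. The sketch below uses the standard idealized model in which each cycle multiplies the copy number by (1 + efficiency), with an efficiency of 1.0 corresponding to perfect doubling. This model, the function names `pcr_copies` and `amplicon_length`, and the coordinate convention are illustrative assumptions and are not taken from the cited patents.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of PCR cycles.

    Assumption: each cycle multiplies the copies by (1 + efficiency),
    where efficiency is between 0 (no amplification) and 1 (doubling).
    Real reactions plateau as reagents are depleted; that is not modeled.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


def amplicon_length(forward_primer_start: int, reverse_primer_end: int) -> int:
    """Length of the amplified segment bounded by the two primers.

    Both arguments are 1-based positions on the same template strand:
    the position matching the 5' end of the forward primer, and the
    position opposite the 5' end of the reverse primer (assumed convention).
    """
    if reverse_primer_end < forward_primer_start:
        raise ValueError("reverse primer must lie downstream of the forward primer")
    return reverse_primer_end - forward_primer_start + 1


copies_after_30 = pcr_copies(1, 30)        # single molecule, perfect doubling
segment = amplicon_length(101, 250)        # primers bounding positions 101..250
```

Under these assumptions a single template molecule yields 2^30 copies after 30 cycles, which illustrates why the amplified segment becomes the predominant sequence in the mixture.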

As used herein, the term “nucleic acid detection assay” refers to any method of determining the nucleotide composition of a nucleic acid of interest. Nucleic acid detection assays include, but are not limited to, DNA sequencing methods, probe hybridization methods, structure specific cleavage assays (e.g., the INVADER assay (Hologic, Inc.), described, e.g., in U.S. Pat. Nos. 5,846,717, 5,985,557, 5,994,069, 6,001,567, 6,090,543, and 6,872,816, WO 2006/050499; Lyamichev et al., Nat. Biotech., 17:292 (1999), Hall et al., PNAS, USA, 97:8272 (2000), and US 2009/0253142, each of which is herein incorporated by reference in its entirety for all purposes); enzyme mismatch cleavage methods (e.g., Variagenics, U.S. Pat. Nos. 6,110,684, 5,958,692, 5,851,770, herein incorporated by reference in their entireties); polymerase chain reaction (PCR), described above; branched hybridization methods (e.g., Chiron, U.S. Pat. Nos. 5,849,481, 5,710,264, 5,124,246, and 5,624,802, herein incorporated by reference in their entireties); rolling circle replication (e.g., U.S. Pat. Nos. 6,210,884, 6,183,960 and 6,235,502, herein incorporated by reference in their entireties); NASBA (e.g., U.S. Pat. No. 5,409,818, herein incorporated by reference in its entirety); molecular beacon technology (e.g., U.S. Pat. No. 6,150,097, herein incorporated by reference in its entirety); E-sensor technology (Motorola, U.S. Pat. Nos. 6,248,229, 6,221,583, 6,013,170, and 6,063,573, herein incorporated by reference in their entireties); cycling probe technology (e.g., U.S. Pat. Nos. 5,403,711, 5,011,769, and 5,660,988, herein incorporated by reference in their entireties); Dade Behring signal amplification methods (e.g., U.S. Pat. Nos. 6,121,001, 6,110,677, 5,914,230, 5,882,867, and 5,792,614, herein incorporated by reference in their entireties); ligase chain reaction (e.g., Barany, Proc. Natl. Acad. Sci USA 88, 189-93 (1991)); and sandwich hybridization methods (e.g., U.S. Pat. No. 5,288,609, herein incorporated by reference in its entirety).

In some embodiments, target nucleic acid is amplified (e.g., by PCR) and amplified nucleic acid is detected simultaneously using an invasive cleavage assay. Assays configured for performing a detection assay (e.g., invasive cleavage assay) in combination with an amplification assay are described in U.S. Pat. No. 9,096,893, incorporated herein by reference in its entirety for all purposes. Additional amplification plus invasive cleavage detection configurations, termed the QuARTS method, are described in U.S. Pat. Nos. 8,361,720; 8,715,937; 8,916,344; and 9,212,392, incorporated herein by reference in their entireties for all purposes.

The term “invasive cleavage structure” as used herein refers to a cleavage structure comprising i) a target nucleic acid, ii) an upstream nucleic acid (e.g., an invasive or “INVADER” oligonucleotide), and iii) a downstream nucleic acid (e.g., a probe), where the upstream and downstream nucleic acids anneal to contiguous regions of the target nucleic acid, and where an overlap forms between a 3′ portion of the upstream nucleic acid and the duplex formed between the downstream nucleic acid and the target nucleic acid. An overlap occurs where one or more bases from the upstream and downstream nucleic acids occupy the same position with respect to a target nucleic acid base, whether or not the overlapping base(s) of the upstream nucleic acid are complementary with the target nucleic acid, and whether or not those bases are natural bases or non-natural bases. In some embodiments, the 3′ portion of the upstream nucleic acid that overlaps with the downstream duplex is a non-base chemical moiety such as an aromatic ring structure, e.g., as disclosed, for example, in U.S. Pat. No. 6,090,543, incorporated herein by reference in its entirety. In some embodiments, one or more of the nucleic acids may be attached to each other, e.g., through a covalent linkage such as a nucleic acid stem-loop, or through a non-nucleic acid chemical linkage (e.g., a multi-carbon chain).

As used herein, the term “flap endonuclease assay” includes “INVADER” invasive cleavage assays and QuARTS assays, as described above. The term “flap oligonucleotide” refers to an oligonucleotide cleavable in a detection assay, such as an invasive cleavage assay, by a flap endonuclease. In preferred embodiments, a flap oligonucleotide forms an invasive cleavage structure with other nucleic acids, e.g., a target nucleic acid and an invasive oligonucleotide.

As used herein, the term “PCR-invasive cleavage assay” refers to an assay in which target nucleic acid is amplified and amplified nucleic acid is detected simultaneously using a signal-amplifying invasive cleavage assay employing a FRET cassette, and in which the assay reagents comprise a mixture containing DNA polymerase, FEN-1 endonuclease, a primary probe comprising a portion complementary to a target nucleic acid, and a hairpin FRET cassette. PCR-invasive cleavage assays include the QuARTS assays described in U.S. Pat. Nos. 8,361,720; 8,715,937; 8,916,344; and 9,212,392, and the amplification assays of U.S. Pat. No. 9,096,893, as diagrammed in FIG. 1 of that patent, each of which is incorporated herein by reference for all purposes.

As used herein, the term “PCR-invasive cleavage assay reagents” refers to one or more reagents for detecting target sequences in a PCR-invasive cleavage assay, the reagents comprising nucleic acid molecules capable of participating in amplification of a target nucleic acid and in formation of an invasive cleavage structure in the presence of the target sequence, in a mixture containing DNA polymerase, FEN-1 endonuclease and a FRET cassette, and optionally a reverse transcriptase.

As used herein, the term “FRET cassette” refers to a hairpin oligonucleotide that contains a fluorophore moiety and a nearby quencher moiety that quenches the fluorophore. Hybridization of a cleaved flap (e.g., from cleavage of a target-specific probe in a PCR-invasive cleavage assay) with a FRET cassette produces a secondary substrate for the flap endonuclease, e.g., a FEN-1 enzyme. Once this substrate is formed, the 5′ fluorophore-containing base is cleaved from the cassette, thereby generating a fluorescence signal. In preferred embodiments, a FRET cassette comprises an unpaired 3′ portion to which a cleavage product, e.g., a portion of a cleaved flap oligonucleotide, can hybridize to form an invasive cleavage structure cleavable by a FEN-1 endonuclease.

A nucleic acid “hairpin” as used herein refers to a region of a single-stranded nucleic acid that contains a duplex (i.e., base-paired) stem and a loop, formed when the nucleic acid comprises two portions that are sufficiently complementary to each other to form a plurality of consecutive base pairs.

As used herein, the term “FRET” refers to fluorescence resonance energy transfer, a process in which moieties (e.g., fluorophores) transfer energy, e.g., among themselves, or from a fluorophore to a non-fluorophore (e.g., a quencher molecule). In some circumstances, FRET involves an excited donor fluorophore transferring energy to a lower-energy acceptor fluorophore via a short-range (e.g., about 10 nm or less) dipole-dipole interaction. In other circumstances, FRET involves a loss of fluorescence energy from a donor and an increase in fluorescence in an acceptor fluorophore. In still other forms of FRET, energy can be exchanged from an excited donor fluorophore to a non-fluorescing molecule (e.g., a quenching molecule). FRET is known to those of skill in the art and has been described (See, e.g., Stryer et al., 1978, Ann. Rev. Biochem., 47:819; Selvin, 1995, Methods Enzymol., 246:300; Orpana, 2004 Biomol Eng 21, 45-50; Olivier, 2005 Mutat Res 573, 103-110, each of which is incorporated herein by reference in its entirety).

As used herein, the term “FEN-1” in reference to an enzyme refers to a non-polymerase flap endonuclease from a eukaryote or archaeal organism. See, e.g., WO 02/070755, and Kaiser M. W., et al. (1999) J. Biol. Chem., 274:21387, which are incorporated by reference herein in their entireties for all purposes.

As used herein, the term “FEN-1 activity” refers to any enzymatic activity of a FEN-1 enzyme, including but not limited to flap endonuclease (FEN), nick exonuclease (EXO), and gap endonuclease (GEN) activities (see, e.g., Shen, et al., BioEssays Volume 27, Issue 7, Pages 717-729, incorporated herein by reference).

As used herein, the term “primer annealing” refers to conditions that permit oligonucleotide primers to hybridize to template nucleic acid strands. Conditions for primer annealing vary with the length and sequence of the primer and are generally based upon the T_m that is determined or calculated for the primer. For example, an annealing step in an amplification method that involves thermocycling involves reducing the temperature after a heat denaturation step to a temperature based on the T_m of the primer sequence, for a time sufficient to permit such annealing.
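As one illustration of a calculated T_m, the sketch below applies the Wallace rule, a rough rule of thumb for short oligonucleotides that assigns 2 °C to each A or T and 4 °C to each G or C. The specification does not state which T_m calculation is used, so both the rule and the 5 °C offset below the T_m used to suggest an annealing temperature are assumptions chosen for illustration. The function names `wallace_tm` and `suggested_annealing_temp` are hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Estimate primer T_m in degrees Celsius with the Wallace rule.

    T_m = 2 * (A + T) + 4 * (G + C)

    This is a coarse approximation intended for short oligonucleotides;
    nearest-neighbor thermodynamic models are more accurate.
    """
    seq = primer.upper()
    invalid = set(seq) - set("ACGT")
    if not seq or invalid:
        raise ValueError("primer must be a non-empty DNA sequence of A, C, G, T")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


def suggested_annealing_temp(primer: str, offset_c: int = 5) -> int:
    """Suggest an annealing temperature a fixed offset below the T_m.

    The default 5 degree offset is an illustrative assumption, not a
    value taken from the specification.
    """
    return wallace_tm(primer) - offset_c


tm = wallace_tm("ATGCATGC")                      # 4 A/T bases and 4 G/C bases
anneal = suggested_annealing_temp("ATGCATGC")
```

In practice the annealing temperature is usually refined empirically, since salt concentration, primer concentration, and sequence context all shift the effective T_m.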

As used herein, the term “amplifiable nucleic acid” is used in reference to nucleic acids that may be amplified by any amplification method. It is contemplated that “amplifiable nucleic acid” will usually comprise “sample template.”

As used herein, the term “sample template” refers to nucleic acid originating from a sample that is analyzed for the presence of “target.” In contrast, “background template” is used in reference to nucleic acid other than sample template that may or may not be present in a sample. The presence of background template is most often inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.

A sample “suspected of containing” a nucleic acid may or may not contain the target nucleic acid molecule.

The term “real time” as used herein in reference to detection of nucleic acid amplification or signal amplification refers to the detection or measurement of the accumulation of products or signal in the reaction while the reaction is in progress, e.g., during incubation or thermal cycling. Such detection or measurement may occur continuously, or it may occur at a plurality of discrete points during the progress of the amplification reaction, or it may be a combination. For example, in a polymerase chain reaction, detection (e.g., of fluorescence) may occur continuously during all or part of thermal cycling, or it may occur transiently, at one or more points during one or more cycles. In some embodiments, real time detection of PCR is accomplished by determining a level of fluorescence at the same point (e.g., a time point in the cycle, or temperature step in the cycle) in each of a plurality of cycles, or in every cycle. Real time detection of amplification may also be referred to as detection “during” the amplification reaction.

As used herein, the terms “reverse transcription” and “reverse transcribe” refer to the use of a template-dependent polymerase to produce a DNA strand complementary to an RNA template. A polymerase capable of producing a DNA strand complementary to an RNA template is generally referred to as a “reverse transcriptase” or as a polymerase that has “reverse transcriptase activity”.

As used herein, the term “abundance of nucleic acid” refers to the amount of a particular target nucleic acid sequence present in a sample or aliquot. The amount is generally referred to in terms of mass (e.g., μg), mass per unit of volume (e.g., μg/μl), copy number (e.g., 1000 copies, 1 attomole), or copy number per unit of volume (e.g., 1000 copies per ml, 1 attomole per μl). Abundance of a nucleic acid can also be expressed as an amount relative to the amount of a standard of known concentration or copy number. Measurement of abundance of a nucleic acid may be on any basis understood by those of skill in the art as being a suitable quantitative representation of nucleic acid abundance, including physical density of the sample, optical density, refractive property, staining properties, or on the basis of the intensity of a detectable label, e.g., a fluorescent label.

The term “amplicon” or “amplified product” refers to a segment of nucleic acid, generally DNA, generated by an amplification process such as the PCR process. The terms are also used in reference to RNA segments produced by amplification methods that employ RNA polymerases, such as NASBA, TMA, etc.

The term “amplification plot” as used in reference to a thermal cycling amplification reaction refers to the plot of signal that is indicative of amplification, e.g., fluorescence signal, versus cycle number. When used in reference to a non-thermal cycling amplification method, an amplification plot generally refers to a plot of the accumulation of signal as a function of time.

The term “baseline” as used in reference to an amplification plot refers to the detected signal coming from assembled amplification reactions prior to incubation or, in the case of PCR, in the initial cycles, in which there is little change in signal.

The terms “no template control” and “no target control” (or “NTC”) as used herein in reference to a control reaction refer to a reaction or sample that does not contain template or target nucleic acid. It is used to verify amplification quality.

As used herein, the term “quantitative amplification data set” refers to the data obtained during quantitative amplification of the target sample, e.g., target DNA. In the case of quantitative PCR or QuARTS assays, the quantitative amplification data set is a collection of fluorescence values obtained during amplification, e.g., during a plurality of, or all of, the thermal cycles. Data for quantitative amplification is not limited to data collected at any particular point in a reaction, and fluorescence may be measured at a discrete point in each cycle or continuously throughout each cycle.

The abbreviations “Ct” and “Cp” as used herein in reference to real-time detection during an amplification reaction that is thermal cycled refer to the cycle at which signal (e.g., fluorescent signal) crosses a predetermined threshold value indicative of positive signal. Various methods have been used to calculate the threshold that is used as a determinant of signal versus concentration, and the value is generally expressed as either the “crossing threshold” (Ct) or the “crossing point” (Cp). Either Cp values or Ct values may be used in embodiments of the methods presented herein for analysis of real-time signal for the determination of the amount of RNA marker(s) or reference markers in an assay or sample.
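As an illustration of the threshold-crossing idea defined above, the following minimal sketch finds the fractional cycle at which a fluorescence trace first crosses a fixed threshold, using linear interpolation between adjacent cycles. The function name, the interpolation scheme, and the example values are assumptions made for illustration only; instrument software may compute Ct or Cp values differently.

```python
def crossing_cycle(fluorescence, threshold):
    """Return the fractional cycle at which the signal first rises through
    `threshold`, or None if it never does.

    `fluorescence[0]` is taken to be the reading for cycle 1 (an assumption).
    """
    for i in range(1, len(fluorescence)):
        prev, curr = fluorescence[i - 1], fluorescence[i]
        if prev < threshold <= curr:
            # prev belongs to cycle i, curr to cycle i + 1; interpolate linearly.
            fraction = (threshold - prev) / (curr - prev)
            return i + fraction
    return None


# Hypothetical trace: flat baseline followed by doubling signal.
trace = [1.0, 1.0, 1.0, 2.0, 4.0, 8.0, 16.0]
ct = crossing_cycle(trace, threshold=3.0)
```

A trace that never reaches the threshold (for example, a no template control) yields `None` rather than a cycle number.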

As used herein, the term “kit” refers to any delivery system for delivering materials. In the context of reaction assays, such delivery systems include systems that allow for the storage, transport, or delivery of reaction reagents (e.g., oligonucleotides, enzymes, etc. in the appropriate containers) and/or supporting materials (e.g., buffers, written instructions for performing the assay etc.) from one location to another. For example, kits include one or more enclosures (e.g., boxes) containing the relevant reaction reagents and/or supporting materials. As used herein, the term “fragmented kit” refers to delivery systems comprising two or more separate containers that each contain a subportion of the total kit components. The containers may be delivered to the intended recipient together or separately. For example, a first container may contain an enzyme for use in an assay, while a second container contains oligonucleotides.

The term “system” as used herein refers to a collection of articles for use for a particular purpose. In some embodiments, the articles comprise instructions for use, as information supplied on e.g., an article, on paper, or on recordable media (e.g., DVD, CD, flash drive, etc.). In some embodiments, instructions direct a user to an online location, e.g., a website.

As used herein, the term “information” refers to any collection of facts or data. In reference to information stored or processed using a computer system(s), including but not limited to internets, the term refers to any data stored in any format (e.g., analog, digital, optical, etc.). As used herein, the term “information related to a subject” refers to facts or data pertaining to a subject (e.g., a human, plant, or animal). The term “genomic information” refers to information pertaining to a genome including, but not limited to, nucleic acid sequences, genes, percentage methylation, allele frequencies, RNA expression levels, protein expression, phenotypes correlating to genotypes, etc. “Allele frequency information” refers to facts or data pertaining to allele frequencies, including, but not limited to, allele identities, statistical correlations between the presence of an allele and a characteristic of a subject (e.g., a human subject), the presence or absence of an allele in an individual or population, the percentage likelihood of an allele being present in an individual having one or more particular characteristics, etc.

DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a schematic diagram of a combined reverse transcription-QuARTS flap endonuclease detection assay for real-time detection of RNA. Use of multiple different probe flap/FRET cassette dye combinations allows multiple different target nucleic acids to be detected together in multiplex reactions.

FIG. 2 compares the effects of different amounts of reverse transcriptase and different reverse transcription conditions on the detection of known amounts of target RNA in RT-QuARTS assays.

FIG. 3 shows standard curves measuring marker LRG1 RNA. Panel A describes the dilution series, the average Cp value at each dilution, and the strands/reaction calculated from the amplification plots shown in panel B. Panel C shows a graph of the Cp plotted against the log of the amount of RNA present in the sample.

FIG. 4 compares the signals measured for markers FAM83A, XAGE1D, CYP24A1, GAGE12D, LRG1, and MAGEA4 in cancer and normal tissue samples, as described below.

FIG. 5 shows graphs comparing the sensitivity and specificity when samples are analyzed using the combinations of four or six expression markers, as listed above each panel.

DETAILED DESCRIPTION OF THE INVENTION

Provided herein is technology relating to RNA expression markers for use in assays for detection and quantification of RNA. In particular, the technology relates to use of RNA-based gene expression assays to detect lung cancer.

In this detailed description of the various embodiments, for purposes of explanation, numerous specific details are set forth to provide a thorough understanding of the embodiments disclosed. One skilled in the art will appreciate, however, that these various embodiments may be practiced with or without these specific details. In other instances, structures and devices are shown in block diagram form. Furthermore, one skilled in the art can readily appreciate that the specific sequences in which methods are presented and performed are illustrative and it is contemplated that the sequences can be varied and still remain within the spirit and scope of the various embodiments disclosed herein.

The methods and compositions provided herein relate to characterizing the expression from marker genes by characterizing RNA molecules (“RNA markers”) in a sample, wherein the RNA presence, absence, or status (e.g., with respect to post-transcription modifications or processing) is indicative of neoplasia. Accordingly, provided here are compositions and methods directed toward analysis of RNA markers that correlate with lung neoplasia. In preferred embodiments the technology provides assays wherein RNA markers are reverse transcribed, amplified, and detected in real time in a single reaction mixture, and in a single vessel.

Also provided herein are compositions and kits for practicing the methods. For example, in some embodiments, reagents (e.g., primers, probes) specific for one or more RNA expression markers are provided alone or in sets (e.g., sets of primer pairs for amplifying a plurality of RNA markers). Additional reagents for conducting a detection assay may also be provided (e.g., enzymes, buffers, positive and negative controls for conducting QuARTS assays, RT-QuARTS assays, PCR, sequencing, or other assays). In some embodiments, kits containing one or more reagents necessary, sufficient, or useful for conducting a method are provided. Also provided are reaction mixtures containing the reagents. Further provided are master mix reagent sets containing a plurality of reagents that may be added to each other and/or to a test sample to complete a reaction mixture.

The technology relates to the analysis of any sample associated with lung cancer. For example, in some embodiments the sample comprises a tissue and/or biological fluid obtained from a patient. In some embodiments, the sample comprises a secretion. In some embodiments, the sample comprises sputum, blood, serum, plasma, lung tissue samples, or lung cells. In some embodiments, the subject is human. Such samples can be obtained by any number of means known in the art, such as will be apparent to the skilled person.

I. RNA Detection Assays to Detect Lung Cancer

Eight candidate mRNA markers (GAGE12D, FAM83A, LRG1, XAGE-1d, MAGEA4, SFTPB, AKAP4, and CYP24A1) were selected based on discrimination reported in the literature. As described below, samples from 246 patients (119 controls, 127 lung cancer cases) were tested. The lung cancer cases comprised adenocarcinoma (65), squamous (34), large cell (13), small cell (4), and other carcinomas (11). The controls were from patients having benign lung nodules (37), normal lung (60), chronic obstructive pulmonary disorder (COPD) (10), and normal lung adjacent to tumor (12). Cases and controls included smokers and non-smokers.

Messenger RNA expression levels were assayed in a single-tube reverse transcription QuARTS (Quantitative Allele-Specific Real-time Target and Signal amplification) assay as described herein below, a reaction configuration that simultaneously measures copy numbers of two mRNA markers and a housekeeping reference mRNA (CASC3). To account for sample-to-sample variability, relative gene expression values of each mRNA marker were calculated by dividing the copy numbers obtained for each of the mRNAs by the CASC3 mRNA copy number.
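The normalization described above, in which each marker copy number is divided by the CASC3 copy number from the same reaction, can be sketched as follows. The function name and the copy numbers are hypothetical and serve only to illustrate the arithmetic.

```python
def relative_expression(marker_copies, reference_copies):
    """Divide each marker copy number by the reference (e.g., CASC3) copy number.

    `marker_copies` maps marker name -> measured copy number for one sample.
    """
    if reference_copies <= 0:
        raise ValueError("reference copy number must be positive")
    return {name: copies / reference_copies
            for name, copies in marker_copies.items()}


# Hypothetical copy numbers from one triplex reaction (two markers + CASC3).
sample = {"LRG1": 500.0, "MAGEA4": 50.0}
normalized = relative_expression(sample, reference_copies=1000.0)
```

Because the marker and reference values come from the same reaction, differences in RNA input or recovery between samples largely cancel in the ratio.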

Receiver operator characteristic (ROC) curve analyses resulted in an area under the curve (AUC) of 0.976. At 100% specificity, the mRNA panel of 6 markers achieved a sensitivity of 92.1% for all cancers (117/127) and 93.9% for adenocarcinoma and squamous carcinoma combined (93/99).
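The sensitivity figures above follow directly from the reported counts, and the notion of sensitivity at 100% specificity can be sketched in a few lines. The sketch assumes each sample has already been reduced to a single score where higher means more cancer-like; how the six markers are combined into such a score is not specified here, and the scores below are invented for illustration.

```python
def sensitivity_at_full_specificity(case_scores, control_scores):
    """Place the cutoff at the highest control score, so that no control is
    called positive, and report the fraction of cases above that cutoff."""
    cutoff = max(control_scores)
    detected = sum(1 for score in case_scores if score > cutoff)
    return detected / len(case_scores), cutoff


# Reported counts reproduce the stated percentages.
all_cancers_pct = round(117 / 127 * 100, 1)
adeno_squamous_pct = round(93 / 99 * 100, 1)

# Hypothetical scores, for illustration only.
sens, cutoff = sensitivity_at_full_specificity(
    case_scores=[0.9, 0.8, 0.2, 0.7],
    control_scores=[0.1, 0.3],
)
```

Setting the cutoff at the maximum control score is one simple way to guarantee 100% specificity on the tested set; it is not necessarily the procedure used to obtain the reported results.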

II. RNA Detection Assays and Kits

The markers described herein find use in a variety of RNA expression assays, e.g., qRT-PCR, digital PCR, gene expression arrays, etc. In some embodiments, a modified version of a quantitative real-time target and signal amplification (QuARTS) assay is used to evaluate gene expression. In DNA detection, three reactions occur during each QuARTS assay, including amplification (reaction 1) and target probe cleavage (reaction 2) in the primary reaction; and FRET cleavage and fluorescent signal generation (reaction 3) in the secondary reaction. After the first few cycles generate initial amounts of cleaved probe, these reactions occur essentially concurrently. As modified herein, a reverse transcription step is included to produce cDNA for QuARTS flap assay detection.

When target nucleic acid is amplified with specific primers, a specific detection probe with a flap sequence loosely binds to the amplicon. The presence of the specific invasive oligonucleotide at the target binding site causes a 5′ nuclease, e.g., a FEN-1 endonuclease, to release the flap sequence by cutting between the detection probe and the flap sequence. The flap sequence is complementary to a non-hairpin portion of a corresponding FRET cassette. Accordingly, the flap sequence functions as an invasive oligonucleotide on the FRET cassette and effects a cleavage between the FRET cassette fluorophore and a quencher, which produces a fluorescent signal. The cleavage reaction can cut multiple probes per target and thus release multiple fluorophores per flap, providing exponential signal amplification. A QuARTS flap endonuclease assay can detect multiple targets in a single reaction vessel, e.g., by using FRET cassettes with different dyes.

Methods of isolating RNA from samples are known in the art. For example, RNA isolation methods may comprise one or more of organic extraction, ultrafiltration, hybrid capture, etc. In some embodiments, cells or lysed samples containing RNA may be added directly to assay reactions without purification.

In some embodiments, the sample comprises blood, serum, plasma, or saliva. In some embodiments, the subject is human. Such samples can be obtained by any number of means known in the art, such as will be apparent to the skilled person. Cell free or substantially cell free samples can be obtained by subjecting the sample to various techniques known to those of skill in the art which include, but are not limited to, centrifugation and filtration. Although it is generally preferred that no invasive techniques are used to obtain the sample, it still may be preferable to obtain samples such as tissue homogenates, tissue sections, and biopsy specimens. The technology is not limited in the methods used to prepare the samples and provide a nucleic acid for testing. For example, in some embodiments, an RNA is isolated from blood or from a plasma sample using a hybrid capture method, e.g., using target-specific binding materials (e.g., oligonucleotides) on solid supports.

The analysis of markers can be carried out separately or simultaneously with additional markers within one test sample. For example, several markers can be combined into one test for efficient processing of multiple samples and for potentially providing greater diagnostic and/or prognostic accuracy. In addition, one skilled in the art would recognize the value of testing multiple samples (for example, at successive time points) from the same subject. Such testing of serial samples can allow the identification of changes in marker expression over time. Changes in expression, as well as the absence of change in expression, can provide useful information about the disease status that includes, but is not limited to, identifying the approximate time from onset of the event, the presence and amount of salvageable tissue, the appropriateness of drug therapies, the effectiveness of various therapies, and identification of the subject's outcome, including risk of future events.

The analysis of biomarkers can be carried out in a variety of physical formats. For example, the use of microtiter plates or automation can be used to facilitate the processing of large numbers of test samples. Alternatively, single sample formats could be developed to facilitate immediate treatment and diagnosis in a timely fashion, for example, in ambulatory transport or emergency room settings.

It is contemplated that embodiments of the technology are provided in the form of a kit. The kits comprise embodiments of the compositions, devices, apparatuses, etc. described herein, and instructions for use of the kit. Such instructions describe appropriate methods for preparing an analyte from a sample, e.g., for collecting a sample and preparing a nucleic acid from the sample. Individual components of the kit are packaged in appropriate containers and packaging (e.g., vials, boxes, blister packs, ampules, jars, bottles, tubes, and the like) and the components are packaged together in an appropriate container (e.g., a box or boxes) for convenient storage, shipping, and/or use by the user of the kit. It is understood that liquid components (e.g., a buffer) may be provided in a lyophilized form to be reconstituted by the user. Kits may include a control or reference for assessing, validating, and/or assuring the performance of the kit. For example, a kit for assaying the amount of a nucleic acid present in a sample may include a control comprising a known concentration of the same or another nucleic acid for comparison and, in some embodiments, a detection reagent (e.g., a primer) specific for the control nucleic acid. The kits are appropriate for use in a clinical setting and, in some embodiments, for use in a user's home. The components of a kit, in some embodiments, provide the functionalities of a system for preparing a nucleic acid solution from a sample. In some embodiments, certain components of the system are provided by the user.

III. Applications

In some embodiments, diagnostic assays identify the presence of a disease or condition in an individual. In some embodiments, the disease is cancer (e.g., lung cancer). In preferred embodiments, markers whose aberrant expression is associated with a lung cancer (e.g., one or more markers selected from GAGE12D, FAM83A, LRG1, XAGE-1d, MAGEA4, SFTPB, AKAP4, and CYP24A1) are used. In some embodiments, an assay further comprises detection of a reference nucleic acid (e.g., CASC3 or β-actin mRNAs; U1 and U6 snRNAs, etc.).

In some embodiments, the technology finds application in treating a patient (e.g., a patient with lung cancer, with early stage lung cancer, or who may develop lung cancer), the method comprising determining the expression levels of one or more markers as provided herein and administering a treatment to the patient based on the results of determining the expression levels. The treatment may be administration of a pharmaceutical compound or a vaccine, performing a surgery, imaging the patient, or performing another test. Preferably, said use is in a method of clinical screening, a method of prognosis assessment, a method of monitoring the results of therapy, a method to identify patients most likely to respond to a particular therapeutic treatment, a method of imaging a patient or subject, and a method for drug screening and development.

In some embodiments, the technology finds application in methods for diagnosing lung cancer in a subject. The terms “diagnosing” and “diagnosis” as used herein refer to methods by which the skilled artisan can estimate and even determine whether or not a subject is suffering from a given disease or condition or may develop a given disease or condition in the future. The skilled artisan often makes a diagnosis on the basis of one or more diagnostic indicators, such as for example a biomarker, the expression of which is indicative of the presence, severity, or absence of the condition.

Along with diagnosis, clinical cancer prognosis relates to determining the aggressiveness of the cancer and the likelihood of tumor recurrence to plan the most effective therapy. If a more accurate prognosis can be made or even a potential risk for developing the cancer can be assessed, appropriate therapy, and in some instances less severe therapy for the patient, can be chosen. Assessment (e.g., analyzing expression) of cancer biomarkers is useful to separate subjects with good prognosis and/or low risk of developing cancer who will need no therapy or limited therapy from those more likely to develop cancer or suffer a recurrence of cancer who might benefit from more intensive treatments.

As such, “making a diagnosis” or “diagnosing”, as used herein, is further inclusive of determining a risk of developing cancer or determining a prognosis, which can provide for predicting a clinical outcome (with or without medical treatment), selecting an appropriate treatment (or whether treatment would be effective), or monitoring a current treatment and potentially changing the treatment, based on the measure of the diagnostic biomarkers disclosed herein.

Further, in some embodiments of the technology, multiple determinations of the biomarkers over time can be made to facilitate diagnosis and/or prognosis. A temporal change in the biomarker can be used to predict a clinical outcome, monitor the progression of lung cancer, and/or monitor the efficacy of appropriate therapies directed against the cancer. In such an embodiment for example, one might expect to see a change in the expression of one or more biomarkers disclosed herein (and potentially one or more additional biomarker(s), if monitored) in a biological sample over time during the course of an effective therapy.

The technology further finds application in methods for determining whether to initiate or continue prophylaxis or treatment of a cancer in a subject. In some embodiments, the method comprises providing a series of biological samples over a time period from the subject; analyzing the series of biological samples to determine expression of at least one biomarker disclosed herein in each of the biological samples; and comparing any measurable change in the expression of one or more of the biomarkers in each of the biological samples. Any changes in the expression of biomarkers over the time period can be used to predict risk of developing cancer, predict clinical outcome, determine whether to initiate or continue the prophylaxis or therapy of the cancer, and determine whether a current therapy is effectively treating the cancer. For example, a first time point can be selected prior to initiation of a treatment and a second time point can be selected at some time after initiation of the treatment. Expression can be measured in each of the samples taken from different time points and qualitative and/or quantitative differences noted. A change in the expression of the biomarkers from the different samples can be correlated with risk for developing lung cancer, prognosis, treatment efficacy, and/or progression of the cancer in the subject.

In preferred embodiments, the methods and compositions of the invention are for treatment or diagnosis of disease at an early stage, for example, before symptoms of the disease appear. In some embodiments, the methods and compositions of the invention are for treatment or diagnosis of disease at a clinical stage.

As noted above, in some embodiments multiple determinations of one or more diagnostic or prognostic biomarkers can be made, and a temporal change in the marker can be used to determine a diagnosis or prognosis. For example, a diagnostic marker can be determined at an initial time, and again at a second time. In such embodiments, an increase in the marker from the initial time to the second time can be diagnostic of a particular type or severity of cancer, or a given prognosis. Likewise, a decrease in the marker from the initial time to the second time can be indicative of a particular type or severity of cancer, or a given prognosis. Furthermore, the degree of change of one or more markers can be related to the severity of the cancer and future adverse events. The skilled artisan will understand that, while in certain embodiments comparative measurements can be made of the same biomarker at multiple time points, one can also measure a given biomarker at one time point, and a second biomarker at a second time point, and a comparison of these markers can provide diagnostic information.

As used herein, the phrase “determining the prognosis” refers to methods by which the skilled artisan can predict the course or outcome of a condition in a subject. The term “prognosis” does not refer to the ability to predict the course or outcome of a condition with 100% accuracy, or even that a given course or outcome is predictably more or less likely to occur based on the expression of a biomarker. Instead, the skilled artisan will understand that the term “prognosis” refers to an increased probability that a certain course or outcome will occur; that is, that a course or outcome is more likely to occur in a subject exhibiting a given condition, when compared to those individuals not exhibiting the condition. For example, in individuals not exhibiting the condition, the chance of a given outcome (e.g., suffering from lung cancer) may be very low.

In some embodiments, a statistical analysis associates a prognostic indicator with a predisposition to an adverse outcome. For example, in some embodiments, an expression level different from that in a normal control sample obtained from a patient who does not have a cancer can signal that a subject is more likely to suffer from a cancer than subjects with a level that is more similar to the expression level in the control sample, as determined by a level of statistical significance. Additionally, a change in expression level from a baseline (e.g., “normal”) level can be reflective of subject prognosis, and the degree of change in expression can be related to the severity of adverse events. Statistical significance is often determined by comparing two or more populations and determining a confidence interval and/or a p value. See, e.g., Dowdy and Wearden, Statistics for Research, John Wiley & Sons, New York, 1983, incorporated herein by reference in its entirety. Exemplary confidence intervals of the present subject matter are 90%, 95%, 97.5%, 98%, 99%, 99.5%, 99.9% and 99.99%, while exemplary p values are 0.1, 0.05, 0.025, 0.02, 0.01, 0.005, 0.001, and 0.0001.

In other embodiments, a threshold degree of change in expression of a prognostic or diagnostic biomarker disclosed herein can be established, and the degree of change in the expression of the biomarker in a biological sample is simply compared to the threshold degree of change in the expression. A preferred threshold change in the expression level for biomarkers provided herein is about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 50%, about 75%, about 100%, and about 150%. In yet other embodiments, a “nomogram” can be established, by which expression of a prognostic or diagnostic indicator (biomarker or combination of biomarkers) is directly related to an associated disposition towards a given outcome. The skilled artisan is acquainted with the use of such nomograms to relate two numeric values with the understanding that the uncertainty in this measurement is the same as the uncertainty in the marker concentration because individual sample measurements are referenced, not population averages.
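The threshold comparison described above can be sketched as a percent-change calculation between two expression measurements. The function name, the use of an absolute (direction-independent) change, and the example values are assumptions for illustration; the text does not prescribe a particular formula.

```python
def change_exceeds_threshold(baseline, followup, threshold_pct):
    """Return (percent change relative to baseline, whether its magnitude
    meets or exceeds `threshold_pct`)."""
    if baseline <= 0:
        raise ValueError("baseline expression must be positive")
    change_pct = (followup - baseline) / baseline * 100.0
    return change_pct, abs(change_pct) >= threshold_pct


# Hypothetical relative expression values at two time points.
pct_up, flagged_up = change_exceeds_threshold(1.0, 1.5, threshold_pct=25.0)
pct_small, flagged_small = change_exceeds_threshold(2.0, 2.1, threshold_pct=25.0)
```

Using the magnitude of the change treats increases and decreases alike; a directional test would compare `change_pct` itself against the threshold.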

In some embodiments, a control sample is analyzed concurrently with the biological sample, such that the results obtained from the biological sample can be compared to the results obtained from the control sample. Additionally, it is contemplated that standard curves can be provided, with which assay results for the biological sample may be compared. Such standard curves present expression levels of a biomarker as a function of assay units, e.g., fluorescent signal intensity, if a fluorescent label is used. Using samples taken from multiple donors, standard curves can be provided for control expression of the one or more biomarkers in normal tissue, as well as for “at-risk” levels of the one or more biomarkers in tissue taken from donors with lung cancer.
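One common form of standard curve for real-time assays, consistent with the Cp versus log-amount plot described for FIG. 3, is a straight-line fit of Cp against the base-10 logarithm of input copies. The sketch below fits such a line by ordinary least squares and inverts it to estimate copies from a measured Cp. The function names and the dilution series are hypothetical, and the linear model is an assumption rather than a description of how the curves in this document were generated.

```python
import math


def fit_standard_curve(copies, cp_values):
    """Least-squares fit of Cp = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cp_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cp_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_cp(cp, slope, intercept):
    """Invert the fitted line to estimate input copies for a measured Cp."""
    return 10 ** ((cp - intercept) / slope)


# Hypothetical 10-fold dilution series with idealized Cp values.
dilutions = [1e2, 1e3, 1e4, 1e5, 1e6]
cps = [40.0 - 3.32 * math.log10(c) for c in dilutions]
slope, intercept = fit_standard_curve(dilutions, cps)
estimated = copies_from_cp(26.72, slope, intercept)
```

The copy numbers recovered this way can then be normalized to a reference marker measured in the same reaction.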

In some embodiments, the subject is diagnosed as having lung cancer if, when compared to a control expression, there is a measurable difference in the expression of at least one biomarker in the sample. Conversely, when no change in expression is identified in the biological sample, the subject can be identified as not having lung cancer, not being at risk for the cancer, or as having a low risk of the cancer. In this regard, subjects having lung cancer or risk thereof can be differentiated from subjects having low to substantially no cancer or risk thereof. Those subjects having a risk of developing lung cancer can be placed on a more intensive and/or regular screening schedule. On the other hand, those subjects having low to substantially no risk may avoid being subjected to screening procedures, until such time as a future screening, for example, a screening conducted in accordance with the present technology, indicates that a risk of lung cancer has appeared in those subjects.

As mentioned above, depending on the embodiment of the method of the present technology, detecting a change in expression of the one or more biomarkers can be a qualitative determination or it can be a quantitative determination. As such, the step of diagnosing a subject as having, or at risk of developing, lung cancer indicates that certain threshold measurements are made, e.g., the expression of the one or more biomarkers in the biological sample varies from a predetermined control expression. In some embodiments of the method, the control expression is any detectable expression of the biomarker. In other embodiments of the method where a control sample is tested concurrently with the biological sample, the predetermined expression is the expression in the control sample. In other embodiments of the method, the predetermined expression is based upon and/or identified by a standard curve. In other embodiments of the method, the predetermined expression is a specific state or range of states. As such, the predetermined expression can be chosen, within acceptable limits that will be apparent to those skilled in the art, based in part on the embodiment of the method being practiced and the desired specificity, etc.

Over recent years, it has become apparent that circulating epithelial cells, representing metastatic tumor cells, can be detected in the blood of many patients with cancer. Molecular profiling of rare cells is important in biological and clinical studies. Applications include characterization of circulating epithelial cells (CEpCs) in the peripheral blood of cancer patients for disease prognosis and personalized treatment (see, e.g., Cristofanilli M, et al. (2004) N Engl J Med 351:781-791; Hayes D F, et al. (2006) Clin Cancer Res 12:4218-4224; Budd G T, et al. (2006) Clin Cancer Res 12:6403-6409; Moreno J G, et al. (2005) Urology 65:713-718; Pantel et al. (2008) Nat Rev 8:329-340; and Cohen S J, et al. (2008) J Clin Oncol 26:3213-3221). Accordingly, embodiments of the present disclosure provide compositions and methods for detecting the presence of metastatic cancer in a subject by identifying the presence of expression markers in plasma or whole blood.

EXPERIMENTAL EXAMPLES

Tissue Extraction.

Tissue samples were obtained from various commercial and non-commercial sources (Asuragen, BioServe, ConversantBio, Cureline, Mayo Clinic, MD Anderson, and PrecisionMed). Tissue sections were examined by a pathologist, who circled histologically distinct lesions to direct careful micro-dissection. Total nucleic acid extraction was performed using the Promega Maxwell RSC system. FFPE slides were scraped and extracted with the Maxwell® RSC DNA FFPE Kit (#AS1450) following the manufacturer's procedure but skipping the RNase digestion step. The same procedure was used for FFPE bulk curls. For frozen punch biopsy samples, a modified procedure using the lysis buffer from the RSC DNA FFPE kit with the Maxwell® RSC Blood DNA kit (#AS1400) was utilized, omitting the RNase step. Prior to testing, samples were diluted 1:5 in 20 ng/μL tRNA in 10 mM Tris-HCl, pH 8.0, 0.1 mM EDTA.

Gene Expression Markers

Gene expression markers tested on lung cancer tissue samples comprised AKAP4, GAGE12D, FAM83A, SFTPB (Pro-Surfactant B), XAGE-1D, CYP24A1, LRG1, and MAGEA4, together with one reference gene. Expression of CASC3 was used as the reference marker.

Lung Tissue Samples

127 cancer tissue samples and 119 normal lung tissue samples were tested. The tissue types tested are summarized in the following tables:

Cancer Tissue Subtypes     N
Adenocarcinoma             65
Bronchioloalveolar         6
Large cell carcinoma       13
Neuroendocrine             2
Small cell carcinoma       4
Squamous cell carcinoma    34
Unknown                    3

Normal Tissue              N
Benign lung nodules        37
Adjacent normal tissue     72
COPD tissue                10

RT-QuARTS.

A QuARTS flap endonuclease assay reaction was modified to add a reverse transcription step. The assay probes were designed to span exon junctions so that the RT-QuARTS assay would specifically detect mRNA targets rather than the corresponding genomic loci. Briefly, the technique combines a reverse transcription step to convert the RNA target into a cDNA strand, a polymerase-based target amplification, and a simultaneous invasive cleavage signal amplification reaction (FIG. 1). The format results in a real-time accumulation of fluorescent signal in proportion to the amount of target mRNA. It produces a similar output to quantitative RT-PCR, but with the added sensitivity and specificity resulting from the addition of the invasive cleavage reaction. RT-QuARTS reactions comprising different amounts of Moloney Murine Leukemia Virus (MMLV) reverse transcriptase and different dilutions of RNA were conducted using reverse transcription reactions of 10 to 45 minutes. FIG. 2 provides a table comparing the results of the different reaction conditions.

Each triplex RT-QuARTS assay as described below consists of one mRNA target reporting to FAM, one to HEX, and the reference mRNA to Quasar 670. Standard curves for each assay were generated by serially diluting known quantities of in vitro-produced transcripts for each marker. Standard curves were created by plotting Cp values against log input strands. The resulting slope and intercept values were used to convert the Cp values of the unknown samples to mRNA strand values. Oligonucleotide sequences for the assays are shown in Table 1.
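The standard-curve conversion described above can be sketched as follows. This is a minimal illustration and not the analysis software used in the study: the function names are hypothetical, the dilution-series Cp values are made-up numbers, and the fit is assumed to be an ordinary least-squares line of Cp against log10 of input strands.

```python
import math


def fit_standard_curve(strands, cp_values):
    """Least-squares fit of Cp = slope * log10(strands) + intercept."""
    xs = [math.log10(s) for s in strands]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cp_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cp_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def cp_to_strands(cp, slope, intercept):
    """Invert the fitted line to estimate input strands from a Cp value."""
    return 10 ** ((cp - intercept) / slope)


# Hypothetical dilution series (illustrative values only, not study data).
example_strands = [10, 100, 1000, 10000, 100000]
example_cp = [33.0, 29.7, 26.4, 23.1, 19.8]

slope, intercept = fit_standard_curve(example_strands, example_cp)
estimated = cp_to_strands(26.4, slope, intercept)
```

With a separate curve fitted per marker, the same inversion would be applied to each unknown sample's Cp value for that marker.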

In vitro transcripts for each target were made from templates containing the DNA sequence amplified in the QuARTS reaction with additional flanking 5′ and 3′ sequences coupled to a T7 promoter. In vitro transcription was done using the T7 Ribomax system (Promega) and the resulting transcripts were quantitated with the Quant-iT RNA assay kit (Thermo Fisher Scientific).

Without reverse transcription, an exemplary QuARTS reaction typically comprises approximately 400-600 nmol/l (e.g., 500 nmol/l) of each primer and detection probe, approximately 100 nmol/l of the invasive oligonucleotide, approximately 600-700 nmol/l of each FRET cassette (FAM, e.g., as supplied commercially by Hologic, Inc.; HEX, e.g., as supplied commercially by BioSearch Technologies; and Quasar 670, e.g., as supplied commercially by BioSearch Technologies), 6.675 ng/μl FEN-1 endonuclease (e.g., Cleavase® 2.0, Hologic, Inc.), 1 unit Taq DNA polymerase in a 30 μl reaction volume (e.g., GoTaq® DNA polymerase, Promega Corp., Madison, Wis.), 10 mmol/l 3-(N-morpholino)propanesulfonic acid (MOPS), 7.5 mmol/l MgCl₂, and 250 μmol/l of each dNTP. Exemplary QuARTS cycling conditions are as shown in the table below. In some applications, analysis of the quantification cycle (C_q) provides a measure of the initial number of target DNA strands (e.g., copy number) in the sample.

RT-QuARTS reactions contained 20 U of MMLV reverse transcriptase (MMLV-RT), 219 ng of Cleavase® 2.0, 1.5 U of GoTaq® DNA Polymerase, 200 nM of each primer, 500 nM each of probe and FRET oligonucleotides, 10 mM MOPS buffer, pH 7.5, 7.5 mM MgCl₂, and 250 μM each dNTP. Reactions were run on a Roche LightCycler 480 system under the following conditions: 42° C. for 30 min (RT reaction), 95° C. for 3 min, 10 cycles of 95° C. for 20 sec, 63° C. for 30 sec, 70° C. for 30 sec, followed by 35 cycles of 95° C. for 20 sec, 53° C. for 1 min, 70° C. for 30 sec, and hold at 40° C. for 30 sec.
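As a rough aid to reading the cycling conditions above, the program can be written down as data and its total programmed hold time summed. This is only a sketch: the stage labels and the function name are hypothetical, and instrument ramp times between temperatures are not included.

```python
# Thermal program transcribed from the RT-QuARTS conditions in the text.
# Each entry: (label, number of cycles, [(temperature in deg C, seconds), ...]).
# Stage labels are illustrative names, not terms from the source.
PROGRAM = [
    ("reverse transcription", 1, [(42, 30 * 60)]),
    ("initial denaturation", 1, [(95, 3 * 60)]),
    ("first cycling stage", 10, [(95, 20), (63, 30), (70, 30)]),
    ("second cycling stage", 35, [(95, 20), (53, 60), (70, 30)]),
    ("final hold", 1, [(40, 30)]),
]


def total_hold_seconds(program):
    """Sum programmed hold times over all stages and cycles (ramps excluded)."""
    return sum(
        cycles * sum(seconds for _, seconds in steps)
        for _, cycles, steps in program
    )


total_seconds = total_hold_seconds(PROGRAM)
total_minutes = total_seconds / 60  # 111.0 minutes of programmed holds
```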

RT-QuARTS with Multiplex Preamplification

In some embodiments, RT-QuARTS assays may comprise a step of multiplex pre-amplification, e.g., to pre-amplify 10, 12, or more targets in a sample. Multiplex pre-amplification for QuARTS assays is described, e.g., in U.S. Patent Appln. Ser. No. 62/249,097, filed Oct. 30, 2015, and 62/332,295, filed May 5, 2016, each of which is incorporated herein by reference.

An RT-pre-amplification is conducted in a reaction mixture containing, e.g., 20 U of MMLV reverse transcriptase, 1.5 U of GoTaq® DNA Polymerase, 10 mM MOPS buffer, pH 7.5, 7.5 mM MgCl₂, 250 μM each dNTP, and oligonucleotide primers (e.g., for 12 targets, 12 primer pairs/24 primers, in equimolar amounts (e.g., 200 nM each primer), or with individual primer concentrations adjusted to balance amplification efficiencies of the different targets). Thermal cycling times and temperatures are selected to be appropriate for the volume of the reaction and the amplification vessel. For example, the reactions may be cycled as follows:

Stage              Temp/Time     # of Cycles
RT                 42° C./30′    1
                   95° C./3′     1
Amplification 1    95° C./20″    10
                   63° C./30″
                   70° C./30″
Cooling            4° C./Hold    1

After thermal cycling, aliquots of the pre-amplification reaction (e.g., 10 μL) are diluted to 500 μL in 10 mM Tris, 0.1 mM EDTA, with or without fish DNA. Aliquots of the diluted pre-amplified DNA (e.g., 10 μL) are used in a QuARTS PCR-flap assay, e.g., as described above.
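The dilution arithmetic in the example volumes above can be checked with a short calculation. This is a sketch using only the example volumes given in the text (10 μL diluted to 500 μL, then 10 μL carried into the assay); the function names are hypothetical.

```python
def dilution_factor(aliquot_ul, final_volume_ul):
    """Fold dilution when an aliquot is brought up to a final volume."""
    return final_volume_ul / aliquot_ul


def undiluted_equivalent_ul(assay_input_ul, factor):
    """Volume of undiluted pre-amplification product carried into the assay."""
    return assay_input_ul / factor


# Example volumes from the text: 10 uL diluted to 500 uL, 10 uL used per assay.
factor = dilution_factor(10.0, 500.0)              # 50-fold dilution
equivalent = undiluted_equivalent_ul(10.0, factor)  # 0.2 uL of undiluted product
```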

In some embodiments, DNA targets, e.g., methylated DNA marker genes, genes corresponding to the RNA marker, etc., may be amplified and detected along with the reverse-transcribed cDNAs in a QuARTS assay reaction. In some embodiments, DNA and cDNA are co-amplified and detected in a single-tube reaction, i.e., without the need to open the reaction vessel at any point between combining the reagents and collecting the output data. In other embodiments, marker DNA from the same sample or from a different sample may be separately isolated, with or without a bisulfite conversion step, and may be combined with sample RNA in an RT-QuARTS assay. In yet other embodiments, RNA and/or DNA samples may be pre-amplified as described above.

The amplification primers and probes used for reverse transcription, amplification, and the flap endonuclease reactions that occur in the RT-QuARTS assay as described herein are shown in Table 1, below:

TABLE 1

AKAP4
  Forward Primer    5′-GGACACTGAGAAGAAAGACCAGTC (SEQ ID NO: 1)
  Reverse Primer    5′-GGGAGCTTGTTTGAAAAGGCA (SEQ ID NO: 2)
  Probe             5′-CCACGGACGCTAAGACAGAGG/3C6/ (SEQ ID NO: 3)

CASC3
  Forward Primer    5′-CTGCAACCACGGGAACTT (SEQ ID NO: 4)
  Reverse Primer    5′-GAGGTGGAGGTCCTGCTC (SEQ ID NO: 5)
  Probe             5′-GACGCGGAGTCGAGGTATGCC/3C6/ (SEQ ID NO: 6)

CYP24A1
  Forward Primer    5′-CTTCAACTGCATTTGGCTCTTTG (SEQ ID NO: 7)
  Reverse Primer    5′-TGTGGCCTGGATGTCGT (SEQ ID NO: 8)
  Probe             5′-CCACGGACGGTTGGATTGTCC/3C6/ (SEQ ID NO: 9)

FAM83A
  Forward Primer    5′-TGGAGATTTGTCCTGTCTGGATC (SEQ ID NO: 10)
  Reverse Primer    5′-CTTGGAGAGGATGTTCCGGT (SEQ ID NO: 11)
  Probe             5′-CCACGGACGCTTACAGCTTCA/3C6/ (SEQ ID NO: 12)

GAGE12D
  Forward Primer    5′-AGGGAGCATCTGCAGGTC (SEQ ID NO: 13)
  Reverse Primer    5′-CCTGTTCCTGGCTATGAGCTTC (SEQ ID NO: 14)
  Probe             5′-CGCCGAGGCAAGGGCCGAAG/3C6/ (SEQ ID NO: 15)

LRG1
  Forward Primer    5′-GAGCAGACAGCGACCAAA (SEQ ID NO: 16)
  Reverse Primer    5′-CAGGAACAGAGTTCTAGAAACATGG (SEQ ID NO: 17)
  Probe             5′-CCACGGACGAAAGCCCAGGGG/3C6/ (SEQ ID NO: 18)

MAGEA4
  Forward Primer    5′-AGAGGAGCACCAAGGAGAAGA (SEQ ID NO: 19)
  Reverse Primer    5′-GGCAAAAGCTGGGCAATGG (SEQ ID NO: 20)
  Probe             5′-CGCCGAGGATCTGCCTGTGG/3C6/ (SEQ ID NO: 21)

SFTPB
  Forward Primer    5′-GTCATCGACTACTTCCAGAACC (SEQ ID NO: 22)
  Reverse Primer    5′-AGGTGCATACAGATGCCG (SEQ ID NO: 23)
  Probe             5′-CGCCGAGGCAGACTGACTCA/3C6/ (SEQ ID NO: 24)

XAGE1D
  Forward Primer    5′-CCCAGGTGCTGGGAAGG (SEQ ID NO: 25)
  Reverse Primer    5′-ACTGATGCAGCTCTTGCAGA (SEQ ID NO: 26)
  Probe             5′-CCACGGACGGGAAATGCGCGA/3C6/ (SEQ ID NO: 27)

FIG. 3 shows exemplary standard curves for LRG-1 RNA at dilutions A-E, i.e., 10 to 100,000 copies per reaction of input RNA, in the RT-QuARTS assay as described above. The average number of RNA strands present, as calculated from the fluorescence signal during amplification, is shown under “Calc. Strands/Rxn” on the right half of panel A. The graph in panel C shows the fluorescence signal accumulation by cycle number for the reactions having the different indicated amounts of input RNA.

RT-QuARTS Quantitative Data Analysis for Marker Detection

Strand values for individual markers from the samples were determined by using the standard curves for each marker, as discussed above for the LRG-1 RNA. The strand numbers were divided by the CASC3 reference marker strand numbers determined in the same assay well to normalize for varying input RNA amounts. The resulting ratio was multiplied by 100 to generate the “% MARKER” value for each mRNA as shown in FIG. 4.
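The normalization described above amounts to a single ratio, sketched here for illustration. The function name and the example strand counts are hypothetical; only the formula (marker strands divided by CASC3 strands from the same well, times 100) comes from the text.

```python
def percent_marker(marker_strands, reference_strands):
    """Marker strands as a percentage of reference (CASC3) strands in one well."""
    if reference_strands <= 0:
        raise ValueError("reference strand count must be positive")
    return 100.0 * marker_strands / reference_strands


# Hypothetical strand counts from one assay well (not study data).
value = percent_marker(250.0, 1000.0)  # 25.0
```

Guarding against a non-positive reference count is a design choice of this sketch; how wells with no detectable reference signal were handled is not stated in the text.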

Receiver operating characteristic (ROC) curves were generated for different groupings of markers using JMP 11.0 software (SAS). The positive percent agreement (diagnostic sensitivity) was calculated by dividing the detected positives by the known lung cancer samples and multiplying by 100, and the negative percent agreement (diagnostic specificity) by dividing the detected negatives by the known normal controls and multiplying by 100.
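The two percent-agreement calculations above can be sketched as follows. This is an illustration only: the function names and the small lists of calls are hypothetical, and the sketch does not reproduce the ROC analysis performed in JMP.

```python
def percent_agreement(detected, known_total):
    """Detected count as a percentage of the known total."""
    if known_total <= 0:
        raise ValueError("known_total must be positive")
    return 100.0 * detected / known_total


def sensitivity_specificity(cancer_calls, normal_calls):
    """Each list holds booleans, True meaning the assay called the sample positive."""
    detected_positives = sum(1 for call in cancer_calls if call)
    detected_negatives = sum(1 for call in normal_calls if not call)
    sensitivity = percent_agreement(detected_positives, len(cancer_calls))
    specificity = percent_agreement(detected_negatives, len(normal_calls))
    return sensitivity, specificity


# Hypothetical calls (not study data): 3 of 4 cancers detected,
# 4 of 5 normal controls called negative.
cancer = [True, True, True, False]
normal = [False, False, False, False, True]
sens, spec = sensitivity_specificity(cancer, normal)  # 75.0, 80.0
```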

FIG. 4 shows the signal measured from individual marker RNAs from cancer and normal samples. FIG. 5 shows the aggregate sensitivity and specificity for samples analyzed using the indicated combinations of mRNA markers.

Target mRNA sequences (showing T nucleotides in place of U nucleotides) are as follows:

AKAP4 (SEQ ID NO: 28) >gi|21493038|ref|NM_139289.1|Homo sapiens A kinase (PRKA)anchor protein 4 (AKAP4), transcript variant 2, mRNACAGGGGTGGCAGCCAACTGCAGGTGCCCAAGAACTTGGCACTTCTCAGTTCCATCTAAAGGGGCACATCTCCCTTCTGGGTGTCACGTTTTCAGCCAAACATCTAAAAGAACTTCATCATCAAGATGTCTGATGATATTGACTGGTTACGCAGCCACAGGGGTGTGTGCAAGGTAGATCTCTACAACCCAGAAGGACAGCAAGATCAGGACCGGAAAGTGATATGCTTTGTCGATGTGTCCACCCTGAATGTAGAAGATAAAGATTACAAGGATGCTGCTAGTTCCAGCTCAGAAGGCAACTTAAACCTGGGAAGTCTGGAAGAAAAAGAGATTATCGTGATCAAGGACACTGAGAAGAAAGACCAGTCTAAGACAGAGGGATCTGTATGCCTTTTCAAACAAGCTCCCTCTGATCCTGTAAGTGTCCTCAACTGGCTTCTCAGTGATCTCCAGAAGTATGCCTTGGGTTTCCAACATGCACTGAGCCCCTCAACCTCTACCTGTAAACATAAAGTAGGAGACACAGAGGGCGAATATCACAGAGCATCCTCTGAGAACTGCTACAGTGTCTATGCCGATCAAGTGAACATAGATTATTTGATGAACAGACCTCAAAACCTACGTCTAGAAATGACAGCAGCTAAAAACACCAACAATAATCAAAGTCCTTCAGCTCCTCCAGCCAAACCTCCTAGCACTCAGAGAGCAGTCATTTCCCCTGATGGAGAATGTTCTATAGATGACCTTTCCTTCTACGTCAACCGACTATCTTCTCTGGTAATCCAGATGGCCCATAAGGAAATCAAGGAGAAGTTGGAAGGTAAAAGCAAATGCCTTCATCATTCAATCTGTCCATCCCCTGGGAACAAAGAGAGAATCAGTCCCCGAACTCCTGCGAGCAAGATTGCTTCTGAAATGGCCTATGAAGCTGTGGAACTGACAGCTGCAGAAATGCGTGGCACTGGAGAGGAGTCCAGGGAAGGTGGCCAGAAAAGCTTTCTATATAGCGAATTATCCAACAAGAGCAAAAGTGGAGACAAACAGATGTCCCAGAGAGAGAGCAAAGAATTTGCAGATTCCATCAGCAAGGGGCTCATGGTTTATGCAAATCAGGTGGCATCTGACATGATGGTCTCTCTCATGAAGACCTTGAAAGTGCACAGCTCTGGGAAGCCAATTCCAGCATCTGTGGTCCTGAAGAGGGTGTTGCTAAGGCACACCAAGGAGATTGTGTCCGATTTGATTGATTCTTGCATGAAGAACCTGCATAATATTACTGGGGTCCTGATGACTGACTCAGACTTTGTCTCAGCTGTCAAGAGAAATCTGTTCAACCAGTGGAAACAAAATGCTACAGACATCATGGAGGCCATGCTGAAGCGCTTGGTCAGTGCCCTTATAGGTGAGGAGAAGGAGACTAAGTCTCAGAGTCTGTCATATGCATCTTTAAAAGCTGGGTCCCATGATCCCAAATGCAGGAATCAGAGTCTTGAATTCTCCACCATGAAAGCTGAAATGAAAGAGAGGGACAAAGGCAAAATGAAATCAGACCCATGCAAGTCACTGACTAGTGCTGAGAAAGTCGGTGAACACATTCTCAAAGAGGGCCTAACCATCTGGAACCAAAAGCAAGGAAACTCATGCAAGGTGGCTACCAAAGCATGCAGCAATAAAGATGAGAAAGGAGAAAAGATCAATGCTTCCACAGATTCACTGGCCAAGGACCTGATTGTCTCTGCCCTTAAGCTGATCCAGTACCATCTGACCCAGCAGACTAAGGGCAAAGATACATGTGAAGAAGACTGTCCTGGTTCCACCATGGGCTATATGGCTCAGAGTACTCAATATGAAAAGTGTGGAGGTGGCCA
AAGTGCCAAAGCACTTTCAGTGAAACAACTAGAATCTCACAGAGCCCCTGGACCATCCACCTGTCAAAAGGAGAACCAACACCTGGACTCCCAGAAAATGGATATGTCAAACATCGTTCTAATGCTGATTCAGAAACTGCTTAATGAGAACCCCTTCAAATGTGAGGATCCATGCGAAGGTGAGAACAAGTGTTCTGAGCCCAGGGCAAGCAAAGCAGCTTCCATGTCCAACAGATCTGACAAAGCGGAAGAACAATGCCAGGAGCATCAAGAACTTGACTGTACCAGTGGGATGAAGCAAGCGAACGGGCAATTTATAGATAAACTAGTAGAATCTGTGATGAAGCTCTGCCTTATCATGGCTAAGTATAGCAACGATGGGGCAGCCCTTGCTGAGTTGGAAGAACAAGCAGCCTCGGCAAATAAGCCCAATTTCAGGGGCACCAGATGCATTCACAGTGGTGCAATGCCACAGAACTATCAAGACTCTCTTGGACATGAAGTAATTGTCAATAATCAGTGCTCTACAAATAGCTTGCAGAAGCAGCTCCAGGCTGTCCTGCAGTGGATTGCAGCCTCCCAGTTTAACGTGCCCATGCTCTACTTCATGGGAGATAAGGATGGACAACTGGAAAAGCTTCCTCAGGTTTCAGCTAAAGCAGCAGAGAAGGGGTACAGTGTAGGAGGTCTTCTTCAAGAGGTCATGAAGTTTGCCAAGGAACGGCAACCAGATGAAGCTGTGGGAAAGGTGGCCAGGAAACAGTTGCTGGACTGGCTGCTCGCTAACCTGTGAGCTGATCCTTGACTCCTCTTCATCTTAGCCCCCCTAGCAGCATTCCATCCCAGCCAGAGCACCCCCACCATCAGGCCAGTCAACTGCACAATACACAACTGTATTTCCCAATACACTTGAGCAGTTGCCTGTGAATGTAAGAGGTGTCAACAAACTGGGAAATAAAATAAAAAAAAATAATAAAAAAAAAAAAAAAAAAAAAAAA CASC3(SEQ ID NO: 29) >gi|102468569|ref|NM_007359.4| Homo sapiens cancersusceptibility candidate 3 (CASC3), 
mRNAAATCCGGGTCGGCCGCAAACGTGCCGCAGGCCTAGGCCCCGCCCAGTGCCCCGCCCCTCCCCCAACACACACACACACACACACACACACACACACCCCAACACACACACACACACCCCAACACACACACACACACACACACACACACACACACACACACACACACACACACACACAGCGGGATGGCCGAGCGCCGCACGCGTAGCACGCCGGGACTAGCTATCCAGCCTCCCAGCAGCCTCTGCGACGGGCGCGGTGCGTAAGTACCTCGCCGGTGGTGGCCGTTCTCCGTAAGATGGCGGACCGGCGGCGGCAGCGCGCTTCGCAAGACACCGAGGACGAGGAATCTGGTGCTTCGGGCTCCGACAGCGGCGGCTCCCCGTTGCGGGGAGGCGGGAGCTGCAGCGGTAGCGCCGGAGGCGGCGGCAGCGGCTCTCTGCCTTCACAGCGCGGAGGCCGAACCGGGGCCCTTCATCTGCGGCGGGTGGAGAGCGGGGGCGCCAAGAGTGCTGAGGAGTCGGAGTGTGAGAGTGAAGATGGCATTGAAGGTGATGCTGTTCTCTCGGATTATGAAAGTGCAGAAGACTCGGAAGGTGAAGAAGGTGAATACAGTGAAGAGGAAAACTCCAAAGTGGAGCTGAAATCAGAAGCTAATGATGCTGTTAATTCTTCAACAAAAGAAGAGAAGGGAGAAGAAAAGCCTGACACCAAAAGCACTGTGACTGGAGAGAGGCAAAGTGGGGACGGACAGGAGAGCACAGAGCCTGTGGAGAACAAAGTGGGTAAAAAGGGCCCTAAGCATTTGGATGATGATGAAGATCGGAAGAATCCAGCATACATACCTCGGAAAGGGCTCTTCTTTGAGCATGATCTTCGAGGGCAAACTCAGGAGGAGGAAGTCAGACCCAAGGGGCGTCAGCGAAAGCTATGGAAGGATGAGGGTCGCTGGGAGCATGACAAGTTCCGGGAAGATGAGCAGGCCCCAAAGTCCCGACAGGAGCTCATTGCTCTTTATGGTTATGACATTCGCTCAGCTCATAATCCTGATGACATCAAACCTCGAAGAATCCGGAAACCCCGATATGGGAGTCCTCCACAAAGAGATCCAAACTGGAACGGTGAGCGGCTAAACAAGTCTCATCGCCACCAGGGTCTTGGGGGCACCCTACCACCAAGGACATTTATTAACAGGAATGCTGCAGGTACCGGCCGTATGTCTGCACCCAGGAATTATTCTCGATCTGGGGGCTTCAAGGAAGGTCGTGCTGGTTTTAGGCCTGTGGAAGCTGGTGGGCAGCATGGTGGCCGGTCTGGTGAGACTGTTAAGCATGAGATTAGTTACCGGTCACGGCGCCTAGAGCAGACTTCTGTGAGGGATCCATCTCCAGAAGCAGATGCTCCAGTGCTTGGCAGTCCTGAGAAGGAAGAGGCAGCCTCAGAGCCACCAGCTGCTGCTCCTGATGCTGCACCACCACCCCCTGATAGGCCCATTGAGAAGAAATCCTATTCCCGGGCAAGAAGAACTCGAACCAAAGTTGGAGATGCAGTCAAGCTTGCAGAGGAGGTGCCCCCTCCTCCTGAAGGACTGATTCCAGCACCTCCAGTCCCAGAAACCACCCCAACTCCACCTACTAAGACTGGGACCTGGGAAGCTCCGGTGGATTCTAGTACAAGTGGACTTGAGCAAGATGTGGCACAACTAAATATAGCAGAACAGAATTGGAGTCCGGGGCAGCCTTCTTTCCTGCAACCACGGGAACTTCGAGGTATGCCCAACCATATACACATGGGAGCAGGACCTCCACCTCAGTTTAACCGGATGGAAGAAATGGGTGTCCAGGGTGGTCGAGCCAAACGCTATTCATCCCAGCGGCAAAGACCTGTGCCAGAGCCCCCCGCCCCTCCAGTGCATATCAGTATCATGGAGGGACATTACTATGATCCACTGCAGTTCCAGGGACCAATCTATACCCATGGTGACAGCCCTGCCCCGCTG
CCTCCACAGGGCATGCTTGTGCAGCCAGGAATGAACCTTCCCCACCCAGGTTTACATCCCCACCAGACACCAGCTCCTCTGCCCAATCCAGGCCTCTATCCCCCACCAGTGTCCATGTCTCCAGGACAGCCACCACCTCAGCAGTTGCTTGCTCCTACTTACTTTTCTGCTCCAGGCGTCATGAACTTTGGTAATCCCAGTTACCCTTATGCTCCAGGGGCACTGCCTCCCCCACCACCGCCTCATCTGTATCCTAATACACAGGCCCCATCACAGGTATATGGAGGAGTGACCTACTATAACCCCGCCCAGCAGCAGGTGCAGCCAAAGCCCTCCCCACCCCGGAGGACTCCCCAGCCAGTCACCATCAAGCCCCCTCCACCTGAGGTTGTAAGCAGGGGTTCCAGTTAATACAAGTTTCTGAATATTTTAAATCTTAACATCATATAAAAAGCAGCAGAGGTGAGAACTCAGAAGAGAAATACAGCTGGCTATCTACTACCAGAAGGGCTTCAAAGATATAGGGTGTGGCTCCTACCAGCAAACAGCTGAAAGAGGAGGACCCCTGCCTTCCTCTGAGGACAGGCTCTAGAGAGAGGGAGAAACAAGTGGACCTCGTCCCATCTTCACTCTTCACTTGAGTTGGCTGTGTTCGGGGGAGCAGAGAGAGCCAGACAGCCCCAAGCTTCTGAGTCTAGATACAGAAGCCCATGTCTTCTGCTGTTCTTCACTTCTGGGAAATTGAAGTGTCTTCTGTTCCCAAGGAAGCTCCTTCCTGTTTGTTTTGTTTTCTAAGATGTTCATTTTTAAAGCCTGGCTTCTTATCCTTAATATTATTTTAATTTTTTCTCTTTGTTTCTGTTTCTTGCTCTCTCTCCCTGCCTTTAAATGAAACAAGTCTAGTCTTCTGGTTTTCTAGCCCCTCTGGATTCCCTTTTGACTCTTCCGTGCATCCCAGATAATGGAGAATGTATCAGCCAGCCTTCCCCACCAAGTCTAAAAAGACCTGGCCTTTCACTTTTAGTTGGCATTTGTTATCCTCTTGTATACTTGTATTCCCTTAACTCTAACCCTGTGGAAGCATGGCTGTCTGCACAGAGGGTCCCATTGTGCAGAAAAGCTCAGAGTAGGTGGGTAGGAGCCCTTCTCTTTGACTTAGGTTTTTAGGAGTCTGAGCATCCATCAATACCTGTACTATGATGGGCTTCTGTTCTCTGCTGAGGGCCAATACCCTACTGTGGGGAGAGATGGCACACCAGATGCTTTTGTGAGAAAGGGATGGTGGAGTGAGAGCCTTTGCCTTTAGGGGTGTGTATTCACATAGTCCTCAGGGCTCAGTCTTTTGAGGTAAGTGGAATTAGAGGGCCTTGCTTCTCTTCTTTCCATTCTTCTTGCTACACCCCTTTTCCAGTTGCTGTGGACCAATGCATCTCTTTAAAGGCAAATATTATCCAGCAAGCAGTCTACCCTGTCCTTTGCAATTGCTCTTCTCCACGTCTTTCCTGCTACAAGTGTTTTAGATGTTACTACCTTATTTTCCCCGAATTCTATTTTTGTCCTTGCAGACAGAATATAAAAACTCCTGGGCTTAAGGCCTAAGGAAGCCAGTCACCTTCTGGGCAAGGGCTCCTATCTTTCCTCCCTATCCATGGCACTAAACCACTTCTCTGCTGCCTCTGTGGAAGAGATTCCTATTACTGCAGTACATACGTCTGCCAGGGGTAACCTGGCCACTGTCCCTGTCCTTCTACAGAACCTGAGGGCAAAGATGGTGGCTGTGTCTCTCCCCGGTAATGTCACTGTTTTTATTCCTTCCATCTAGCAGCTGGCCTAATCACTCTGAGTCACAGGTGTGGGATGGAGAGTGGGGAGAGGCACTTAATCTGTAACCCCCAAGGAGGAAATAACTAAGAGATTCTTCTAGGGGTAGCTGGTGGTTGTGCCTTTTGTAGGCTGTTCCCTTTGCCTTAAACCTGAAGATGTCTCCTCAAGCCTGTGGG
CAGCATGCCCAGATTCCCAGACCTTAAGACACTGTGAGAGTTGTCTCTGTTGGTCCACTGTGTTTAGTTGCAAGGATTTTTCCATGTGTGGTGGTGTTTTTTGTTACTGTTTTAAAGGGTGCCCATTTGTGATCAGCATTGTGACTTGGAGATAATAAAATTTAGACTATAACTTGGCTCCCTAAAAAAAAAAAAAAAAAAA CYP24A1(SEQ ID NO: 30) >gi|193083115|ref|NM_000782.4|Homo sapiens cytochrome P450,family 24, subfamily A, polypeptide 1 (CYP24A1), transcriptvariant 1, mRNAGACAGGAGGAAACGCAGCGCCAGCAGCATCTCATCTACCCTCCTTGACACCTCCCCGTGGCTCCAGCCAGACCCTAGAGGTCAGCCTTGCGGACCAACAGGAGGACTCCCAGCTTTCCCTTTTCAAGAGGTCCCCAGACACCGGCCACCCTCTTCCAGCCCCTGCGGCCAGTGCAAGGAGGCACCAATGCTCTGAGGCTGTCGCGTGGTGCAGCGTCGAGCATCCTCGCCGAGGTCCTTTCTGCTGCCTGTCCCGCCTCACCCCGCTCCATCACACCAGCTGGCCCTCTTTGCTTCCTTTTCCCAGAATCGTTAAGCCCCGACTCCCACTAGCACCTCGTACCAACCTCGCCCCACCCCATCCTCCTGCCTTCCCGCGCTCCGGTGTCCCCCGCTGCCATGAGCTCCCCCATCAGCAAGAGCCGCTCGCTTGCCGCCTTCCTGCAGCAGCTGCGCAGTCCGAGGCAGCCCCCGAGACTGGTGACATCTACGGCGTACACGTCCCCTCAGCCGCGAGAGGTGCCAGTCTGCCCGCTGACAGCTGGTGGCGAGACTCAGAACGCGGCCGCCCTGCCGGGCCCCACCAGCTGGCCACTGCTGGGCAGCCTGCTGCAGATTCTCTGGAAAGGGGGTCTCAAGAAACAGCACGACACCCTGGTGGAGTACCACAAGAAGTATGGCAAGATTTTCCGCATGAAGTTGGGTTCCTTTGAGTCGGTGCACCTGGGCTCGCCATGCCTGCTGGAAGCGCTGTACCGCACCGAGAGCGCGTACCCGCAGCGGCTGGAGATCAAACCGTGGAAGGCCTATCGCGACTACCGCAAAGAAGGCTACGGGCTGCTGATCCTGGAAGGGGAAGACTGGCAGCGGGTCCGGAGTGCCTTTCAAAAGAAACTAATGAAACCAGGGGAAGTGATGAAGCTGGACAACAAAATCAATGAGGTCTTGGCCGATTTTATGGGCAGAATAGATGAGCTCTGTGATGAAAGAGGCCACGTTGAAGACTTGTACAGCGAACTGAACAAATGGTCGTTTGAAAGTATCTGCCTCGTGTTGTATGAGAAGAGATTTGGGCTTCTCCAGAAGAATGCAGGGGATGAAGCTGTGAACTTCATCATGGCCATCAAAACAATGATGAGCACGTTTGGGAGGATGATGGTCACTCCAGTCGAGCTGCACAAGAGCCTCAACACCAAGGTCTGGCAGGACCACACTCTGGCCTGGGACACCATTTTCAAATCAGTCAAAGCTTGTATCGACAACCGGTTAGAGAAGTATTCTCAGCAGCCTAGTGCAGATTTCCTTTGTGACATTTATCACCAGAATCGGCTTTCAAAGAAAGAATTGTATGCTGCTGTCACAGAGCTCCAGCTGGCTGCGGTGGAAACGACAGCAAACAGTCTAATGTGGATTCTCTACAATTTATCCCGTAATCCCCAAGTGCAACAAAAGCTTCTTAAGGAAATTCAAAGTGTATTACCTGAGAATCAGGTGCCACGGGCAGAAGATTTGAGGAATATGCCGTATTTAAAAGCCTGTCTGAAAGAATCTATGAGGCTTACGCCGAGTGTACCATTTACAACTCGGACTCTTGACAAGGCAACAGTTCTGGGTGAATATGCTTTACCCAAAGGAACAGTGC
TCATGCTAAATACCCAGGTGTTGGGATCCAGTGAAGACAATTTTGAAGATTCAAGTCAGTTTAGACCTGAACGTTGGCTTCAGGAGAAGGAAAAAATTAATCCTTTTGCGCATCTTCCATTTGGCGTTGGAAAAAGAATGTGCATTGGTCGCCGATTAGCAGAGCTTCAACTGCATTTGGCTCTTTGTTGGATTGTCCGCAAATACGACATCCAGGCCACAGACAATGAGCCTGTTGAGATGCTACACTCAGGCACCCTGGTGCCCAGCCGGGAACTCCCCATCGCGTTTTGCCAGCGATAATACGCCTCAGATGGTGGTATTTGCTAACATCATATCCAACTCAGGGAAGCGGACTGAGTGCTGGGATCCAAGGCATTCTACAGGGTTCACTGCTGGTTTACACTTCACCTGTGTCAGCACCATCTTCAGGTGCTTAGAATGGCCTGGGAGCCTGTTCTGTCTTGCATCTTCCATGACATGAAAGGGAGGCTGGCACTTGTCAGTCAGGTAGAGGTTACAAACCGTTTCAGGCCCTGCCTACCACATTCACTGTTTGAATCTTTAATTCCCAAGAATAAGTTTACATTTCACAATGAATGACCTACAACAGCTAAATTTTCTGGGGCTGGGAGTAATACTGACAATCCATTTACTGTAGCTCTGCTTAATGTACTACTTAGGAAAATGTCCCTGCTTAATAATGTAAGCCAAGCTAAATGATGGTTAAAGTTATCAGGCCTCCCATGAAATTGCGTTCTTCCTGCATTGAAATAAAAACATTATTGGGAAACTAGAGAACACCTCTATTTTTAAAAGGACTTTAACGAAGTCAAACAACTTATAAGACTAGTGATTCACTGGGGCATTATTTTGTTAGAGGACCTTAAAATTGTTTATTTTTTAAATGTGATTCCTTTATGGCATTAGGGTAAAGATGAAGCAATAATTTTTAAATTGTGTATGTGCATATGAAGCACAGACATGCATGTGTGTGTGTGTCTGTGTGTGTGTGTCCGTGTATGTGTGTGTGGGTTCTAATGGTAATTTGCCTCAGTCATTTTTTTAATATTTGCAGTACTTGATTTAGGATCTGTGGTGCAGGGCAATGTTTCAAAGTTTAGTCACAGCTTAAAAACATTCAGTGTGACTTTAATATTATAAAATGATTTCCCATGCCATAATTTTTCTGTCTATTAAATGGGACAAGTGTAAAGCATGCAAAAGTTAGAGATCTGTTATATAACATTTGTTTTGTGATTTGAACTCCTAGGAAAAATATGATTTCATAAATGTAAAATGCACAGAAATGCATGCAATACTTATAAGACTTAAAAATTGTGTTTACAGATGGTTTATTTGTGCATATTTTTACTACTGCTTTTCCTAAATGCATACTGTATATAATTCTGTGTATTTGATAAATATTTCTTCCTACATTATATTTTTAGAATATTTCAGAAATATACATTTATGTCTTTATATTGTAATAAATATGTACATATCTAGGTATATGCTTTCTCTCTGCTGTGAAATTATTTTTAGAATTATAAATTCACGTCTTGTCAGATTTCATCTGTATACCTTCAAATTCTCTGAAAGTAAAAATAAAAGTTTTTAAATATTAAAAAAAAAAAAAAAAAAAAA FAM83A(SEQ ID NO: 31) >gi|767953716|ref|XM_005251087.2|PREDICTED: Homo sapiens familywith sequence similarity 83, member A (FAM83A), transcriptvariant X1, 
mRNAAGGAAATATCCCATGGCTGACTGTGCCAAGGAGGTGTCTGAGCCAGCCCTCCCGGCCCGAGGGCAGGGCAGGTGGCCCTGAGAGATAAGCCAATCCCGCAGCTGCAGATGAGGAGTTCTGAGAAGCATTGCTCAGGACAGCGGTAAATCACTTCTTGGAGGTGCCCTGCACGCCGGTCCTGGGAGCAGGCGGCCTCCCGGGGGTGCGGGAGCCCCACTCCTCCGTGGTGTGTTCCATTTGCTTCCCACATCTGGAGGAGCTGACGTGCCAGCCTCCCCCAGCACCACCCAGGGACGGGAGGCATGAGCCGGTCAAGGCACCTGGGCAAAATCCGGAAGCGTCTGGAAGATGTCAAGAGCCAGTGGGTCCGGCCAGCCAGGGCTGACTTTAGTGACAACGAGAGTGCCCGGCTGGCCACGGACGCCCTCTTGGATGGGGGTTCTGAAGCCTACTGGCGGGTGCTCAGCCAGGAAGGCGAGGTGGACTTCTTGTCCTCGGTGGAGGCCCAGTACATCCAGGCCCAGGCCAGGGAGCCCCCGTGTCCCCCAGACACCCTGGGAGGGGCGGAAGCAGGCCCTAAGGGACTGGACTCCAGCTCCCTACAGTCCGGCACCTACTTCCCTGTGGCCTCAGAGGGCAGCGAGCCGGCCCTACTGCACAGCTGGGCCTCAGCTGAGAAGCCCTACCTGAAGGAAAAATCCAGCGCCACTGTGTACTTCCAGACCGTCAAGCACAACAACATCAGAGACCTCGTCCGCCGCTGCATCACCCGGACTAGCCAGAACATTTCCATCCGGAGTGTGGAAGGAGAGATATACTGTGCCAAGTCAGGCAGGAAATTCGCTGGCCAAATCCGGGAGAAGTTCATCATCTCGGACTGGAGATTTGTCCTGTCTGGATCTTACAGCTTCACCTGGCTCTGCGGACACGTGCACCGGAACATCCTCTCCAAGTTCACAGGCCAGGCGGTGGAGCTGTTTGACGAGGAGTTCCGCCACCTCTACGCCTCCTCCAAGCCTGTGATGGGCCTGAAGTCCCCGCGGCTGGTCGCCCCCGTCCCGCCCGGAGCAGCCCCGGCCAATGGCCGCCTTAGCAGCAGCAGTGGCTCCGCCAGTGACCGCACGTCCTCCAACCCCTTCAGCGGCCGCTCGGCAGGCAGCCACCCCGGTACCCGAAGTGTGTCCGCGTCTTCAGGGCCCTGTAGCCCCGCGGCCCCACACCCGCCTCCACCGCCCCGGTTCCAGCCCCACCAAGGCCCTTGGGGAGCCCCGAGTCCCCAGGCCCACCTCTCCCCGCGGCCCCACGACGGCCCGCCCGCCGCTGTCTACAGCAACCTGGGGGCCTACAGGCCCACGCGGCTGCAGCTGGAGCAGCTGGGCCTGGTGCCGAGGCTGACTCCAACCTGGAGGCCCTTCCTGCAGGCCTCCCCTCACTTCTGAAGGTCCCATCCCCTGCTGCCCTCCGCAGGCCCAGGGCTGGGCACTCCCTGAGACCCAAAGACCCACCTCAACGACGAGTGGCGTTGAGCCACTTCCCTTTGAAAAGACACTCAAAATCACTGCCATGGTTCAATGTTCCCAGGCCCCAGGCCATCCACTTGCCGGCCCCCACCAGTTCTTGGGTTCCCCGCTCTAGTTTGACCTGTGCAGCACATTCCAGAAGGTTCCAGGGAGGTTGTGGGGCAGCTAGAGGACAAAATCATGAAAACAGAGTCCCTGTCTTCCAGAGATCATCCGGGGCTTTAATATTAATGGCCCCCAAAACTCCGTAAGAAGCAGGAAATGCAGCCCAAGTTTTACAAATGGGTAAACAGAGGCACTGAGAGATAGATGGTAGTTTGGTACTTCTGGTTCCCAGTGCCCAGGAATGGTCCACTCCCAAGAAATTCAGGAAAGAAAGACTGAGGAGAAGGTGTGGGAACATTCTGGATGTTTCGGGAGAGTTGGGGAAACTCCTCCTCTTAGGAAAGGCTAATACTAGGGTATCCTTGGGCCCA
ATGAATTAGGGGTGAGGCCCCAGAACCCGTTATCTATGAGTTGTATGGGGGAGCCATCTGAAGCTGTAGCCACCAGGGATGCAGCTAGCTGAGGAGTTTGGGGTGTTGGGTTGGACAAGGCAGGTTAGTAGACTCAGATTCTTGCTTCAAAGAGCCTTGGGCTGGCCTGGAGGTCCCTGGAGTCTAGACTGGACCTAGGAGCTTGAGTTGTCAGGGGCCAGGACTGGCCCCACTGCAGTGCCCAGGCCAGTCTTGAGCAGCAGGGAGGGCTCAGCTGTCCCCAGATCCAGGTGCCTCTGACCAGCCTGGTCACCTCCTGAGGAATAAATGCTGAACCTCACAAGCCCCATCATTCATTTCTTCTCAATTCACAGTGCCCCTCTTTGTTTCTGGGGTGGAACTAGGTCCTGAGGGCACAGCCTAGCTGAGTGCAAAGAAATATAGGATGCTTAGAAAGCATACAGGAGGGGCCAGGCGTGGTGGCTCATGCCTGTAATCCCAGAACTTTGGGATGCCAAGGTGGTTGGATTACCTGAGATCAGGTGGATTACCTGGTCTCGAGACCAGCCTGACCAATATGGTGAAACCCCGTCTCTACTAAAAATACAAAAATTAGGCTGAGACAGGAGAATTGCTTGAACCCAGGAAGCAGAGGTTGCAATGAGCTGAGATTGCATCACTGCACTCCAGCATGGGCAACAAAGCAAGACTCCGTCACAGAAAAAAAAAAAAAAAAGAGAGAGAGCATAGAGGAGGGTTGGCCAGCCCTGTGGTGGGTGGGATGTCAGAGACACTTCCCAGATAAAGTAAGAGTTAACCCTGCACCTCAGGTGTGATAGTGGGGTCAGTGGTATGTGATCCAGGCTGGGGAGCCAGAGGGGAGCAGGTGCCAACTCCACATCCTTCTCCTGTTTCTAGGCCCTCTCCTCCCTTGTCGGTTTTTGGCGGGGAAGCTCAGCCTTCGCTGTGGAGGGACGAGAGCACAGAGCTCTTCCTCCTGGTGGCCTCTGACCCCTGACGGCCTGTGGCATCCTCCCTAGTCCCCTCTGCCCATCCATCCCTCTGTTCCAATTCTCCACTGCTCCCAGCATGATCTGGGGCATCTTGGCTTCTGGTTTCTTTTATTATTATTATTATTATTAATTATTGTATTCCTGTCCTTCACTTTTTTCCTCCTTAGTTCCTGAAAGTAAACAAAACAAAACAAAAACAAAAAAACAAACAACACTTTGGTTCCTGATGGCTTTCTGAACCCAGCCCTGACCTTGTTGTTTCACAGCTGACGGCTGAGATGAGGTTAGAATGACTGGGCCCGGCTGAACATTCCAAATTGGATTTCACCATCTGCTGAGAAAGTTTAAGGAAGGCAAAGCTTGCCAGGTCACAGAAGCTCCCAAGCCCAGCTTTCCAAAGGCCTCAGCCTGTGCCTGTGTCGAGCTCAGTCCTGGGAGATAGGGGAGAACCTGCAGGCAGGAACAAGCCCCCCTACTCCTGACCACCCTCCATCAGCAGTCTCCCCTCCGTGGTCGTCTTTGTTGACAAAGGTGCAGTTTCTCCTCTCCTGGGCACCTGTAACATGTGATGCGCTGCCTGCTGGGAGGTTAGGTCGGGGCTGCCCCGGCGAGTGGAGCATGAGCAGAACCGCCGAGGGTCACTTCTGGGCAGAAGCTTTGAGAGCCTGGGTCCAGGTTGCCACATAGAAGCAGCTCTCCAGTTGAAACCCTCCTCTGCCAGCCTGGGGTCCTAAGCGATGAGCAGAATCCCCCACTCCCACCCCACCAACCCACAATGGATATGTAGTGAGCAAGAAATAAACCTTTGTTGTTTAAGCCA GAGE12D(SEQ ID NO: 32) gi|187608822|ref|NM_001127199.1|Homo sapiens G antigen 12D (GAGE12D), 
mRNAGTTCACTGGGCGTCTTCTGCCCGGCCCCTTCGCCCACGTGAAGAACGCCAGGGAGCTGTGAGGCAGTGCTGTGTGGTTCCTGCCGTCCGGACTCTTTTTCCTCTACTGAGATTCATCTGTGTGAAATATGAGTTGGCGAGGAAGATCGACCTATTATTGGCCTAGACCAAGGCGCTATGTACAGCCTCCTGAAATGATTGGGCCTATGCGGCCCGAGCAGTTCAGTGATGAAGTGGAACCAGCAACACCTGAAGAAGGGGAACCAGCAACTCAATGTCAGGATCCTGCAGCTGCTCAGGAGGGAGAGGATGAGGGAGCATCTGCAGGTCAAGGGCCGAAGCCTGAAGCTCATAGCCAGGAACAGGGTCACCCACAGACTGGGTGTGAGTGTGAAGATGGTCCTGATGGGCAGGAGATGGACCCGCCAAATCCAGAGGAGGTGAAAACGCCTGAAGAAGGTGAAAAGCAATCACAGTGTTAAAAGAAGACACGTTGAAATGATGCAGGCTGCTCCTATGTTGGAAATTTGTTCATTAAAATTCTCCCAATAAAGCTTTACAGCCTTCTGCAAAGAAGTCTTGCGCA LRG1 (SEQ ID NO: 33) gi|49574519|ref|NM_052972.2|Homo sapiens leucine-rich alpha-2- glycoprotein 1 (LRG1), mRNAGCAGAGCTACCATGTCCTCTTGGAGCAGACAGCGACCAAAAAGCCCAGGGGGCATTCAACCCCATGTTTCTAGAACTCTGTTCCTGCTGCTGCTGTTGGCAGCCTCAGCCTGGGGGGTCACCCTGAGCCCCAAAGACTGCCAGGTGTTCCGCTCAGACCATGGCAGCTCCATCTCCTGTCAACCACCTGCCGAAATCCCCGGCTACCTGCCAGCCGACACCGTGCACCTGGCCGTGGAATTCTTCAACCTGACCCACCTGCCAGCCAACCTCCTCCAGGGCGCCTCTAAGCTCCAAGAATTGCACCTCTCCAGCAATGGGCTGGAAAGCCTCTCGCCCGAATTCCTGCGGCCAGTGCCGCAGCTGAGGGTGCTGGATCTAACCCGAAACGCCCTGACCGGGCTGCCCCCGGGCCTCTTCCAGGCCTCAGCCACCCTGGACACCCTGGTATTGAAAGAAAACCAGCTGGAGGTCCTGGAGGTCTCGTGGCTACACGGCCTGAAAGCTCTGGGGCATCTGGACCTGTCTGGGAACCGCCTCCGGAAACTGCCCCCCGGGCTGCTGGCCAACTTCACCCTCCTGCGCACCCTTGACCTTGGGGAGAACCAGTTGGAGACCTTGCCACCTGACCTCCTGAGGGGTCCGCTGCAATTAGAACGGCTACATCTAGAAGGCAACAAATTGCAAGTACTGGGAAAAGATCTCCTCTTGCCGCAGCCGGACCTGCGCTACCTCTTCCTGAACGGCAACAAGCTGGCCAGGGTGGCAGCCGGTGCCTTCCAGGGCCTGCGGCAGCTGGACATGCTGGACCTCTCCAATAACTCACTGGCCAGCGTGCCCGAGGGGCTCTGGGCATCCCTAGGGCAGCCAAACTGGGACATGCGGGATGGCTTCGACATCTCCGGCAACCCCTGGATCTGTGACCAGAACCTGAGCGACCTCTATCGTTGGCTTCAGGCCCAAAAAGACAAGATGTTTTCCCAGAATGACACGCGCTGTGCTGGGCCTGAAGCCGTGAAGGGCCAGACGCTCCTGGCAGTGGCCAAGTCCCAGTGAGACCAGGGGCTTGGGTTGAGGGTGGGGGGTCTGGTAGAACACTGCAACCCGCTTAACAAATAATCCTGCCTTTGGCCGGGTGCGGGGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCCAGGTGGGCGGATCACGAGGTCAGGAGATCGAGACCATCTTGGCTAACATGGTGAAACCCTGTCTCTACTAAAAATATAAAAAATTAGCCAGGCGTGGTGGTGGGCACCTGTAGTCCCAGCAA
CTCGGGAGGCTGAGGCAGGAGAATGGCGTGAACTTGGGAGGCGGAGCTTGCGGTGAGCCAAGATCGTGCCACTGCACTCTAGCCTGGGCGACAGAGCAAGACTGTCTCAAAAAAATTAAAATTAAAATTAAAAACAAATAATCCTGCCTTTTACAGGTGAAACTCGGGGCTGTCCATAGCGGCTGGGACCCCGTTTCATCCATCCATGCTTCCTAGAACACACGATGGGCTTTCCTTACCCATGCCCAAGGTGTGCCCTCCGTCTGGAATGCCGTTCCCTGTTTCCCAGATCTCTTGAACTCTGGGTTCTCCCAGCCCCTTGTCCTTCCTTCCAGCTGAGCCCTGGCCACACTGGGGCTGCCTTTCTCTGACTCTGTCTTCCCCAAGTCAGGGGGCTCTCTGAGTGCAGGGTCTGATGCTGAGTCCCACTTAGCTTGGGGTCAGAACCAAGGGGTTTAATAAATAACCCTTGAAAACTGGA MAGEA4(SEQ ID NO: 34) >gi|58530866|ref|NM_001011548.1|Homo sapiens MAGE family member A4 (MAGEA4), transcript variant 1, mRNAAGAGACAAGCGAGCTTCTGCGTCTGACTCGCAGCTTGAGACTGGCGGAGGGAAGCCCGCCCAGGCTCTATAAGGAGACAAGGTTCTGAGCAGACAGGCCAACCGGAGGACAGGATTCCCTGGAGGCCACAGAGGAGCACCAAGGAGAAGATCTGCCTGTGGGTCCCCATTGCCCAGCTTTTGCCTGCACTCTTGCCTGCTGCCCTGACCAGAGTCATCATGTCTTCTGAGCAGAAGAGTCAGCACTGCAAGCCTGAGGAAGGCGTTGAGGCCCAAGAAGAGGCCCTGGGCCTGGTGGGTGCACAGGCTCCTACTACTGAGGAGCAGGAGGCTGCTGTCTCCTCCTCCTCTCCTCTGGTCCCTGGCACCCTGGAGGAAGTGCCTGCTGCTGAGTCAGCAGGTCCTCCCCAGAGTCCTCAGGGAGCCTCTGCCTTACCCACTACCATCAGCTTCACTTGCTGGAGGCAACCCAATGAGGGTTCCAGCAGCCAAGAAGAGGAGGGGCCAAGCACCTCGCCTGACGCAGAGTCCTTGTTCCGAGAAGCACTCAGTAACAAGGTGGATGAGTTGGCTCATTTTCTGCTCCGCAAGTATCGAGCCAAGGAGCTGGTCACAAAGGCAGAAATGCTGGAGAGAGTCATCAAAAATTACAAGCGCTGCTTTCCTGTGATCTTCGGCAAAGCCTCCGAGTCCCTGAAGATGATCTTTGGCATTGACGTGAAGGAAGTGGACCCCGCCAGCAACACCTACACCCTTGTCACCTGCCTGGGCCTTTCCTATGATGGCCTGCTGGGTAATAATCAGATCTTTCCCAAGACAGGCCTTCTGATAATCGTCCTGGGCACAATTGCAATGGAGGGCGACAGCGCCTCTGAGGAGGAAATCTGGGAGGAGCTGGGTGTGATGGGGGTGTATGATGGGAGGGAGCACACTGTCTATGGGGAGCCCAGGAAACTGCTCACCCAAGATTGGGTGCAGGAAAACTACCTGGAGTACCGGCAGGTACCCGGCAGTAATCCTGCGCGCTATGAGTTCCTGTGGGGTCCAAGGGCTCTGGCTGAAACCAGCTATGTGAAAGTCCTGGAGCATGTGGTCAGGGTCAATGCAAGAGTTCGCATTGCCTACCCATCCCTGCGTGAAGCAGCTTTGTTAGAGGAGGAAGAGGGAGTCTGAGCATGAGTTGCAGCCAGGGCTGTGGGGAAGGGGCAGGGCTGGGCCAGTGCATCTAACAGCCCTGTGCAGCAGCTTCCCTTGCCTCGTGTAACATGAGGCCCATTCTTCACTCTGTTTGAAGAAAATAGTCAGTGTTCTTAGTAGTGGGTTTCTATTTTGTTGGATGACTTGGAGATTTATCTCTGTTTCCTTTTACAATTGTTGAAATGTTCCTTTTAATG
GATGGTTGAATTAACTTCAGCATCCAAGTTTATGAATCGTAGTTAACGTATATTGCTGTTAATATAGTTTAGGAGTAAGAGTCTTGTTTTTTATTCAGATTGGGAAATCCGTTCTATTTTGTGAATTTGGGACATAATAACAGCAGTGGAGTAAGTATTTAGAAGTGTGAATTCACCGTGAAATAGGTGAGATAAATTAAAAGATACTTAATTCCCGCCTTATGCCTCAGTCTATTCTGTAAAATTTAAAAAATATATATGCATACCTGGATTTCCTTGGCTTCGTGAATGTAAGAGAAATTAAATCTGAATAAATAATTCTTTCTGTTAA SFTPB(SEQ ID NO: 35) >gi|288856298|ref|NM_000542.3|Homo sapiens surfactant protein B (SFTPB), transcript variant 1, mRNATGTAAATGCTCTTCTGACTAATGCAAACCATGTGTCCATAGAACCAGAAGATTTTTCCAGGGGAAAAGAGCCCCCACGCCCCGCCCAGCTATAAGGGGCCATGCACCAAGCAGGGTACCCAGGCTGCAGAGGTGCCATGGCTGAGTCACACCTGCTGCAGTGGCTGCTGCTGCTGCTGCCCACGCTCTGTGGCCCAGGCACTGCTGCCTGGACCACCTCATCCTTGGCCTGTGCCCAGGGCCCTGAGTTCTGGTGCCAAAGCCTGGAGCAAGCATTGCAGTGCAGAGCCCTAGGGCATTGCCTACAGGAAGTCTGGGGACATGTGGGAGCCGATGACCTATGCCAAGAGTGTGAGGACATCGTCCACATCCTTAACAAGATGGCCAAGGAGGCCATTTTCCAGGACACGATGAGGAAGTTCCTGGAGCAGGAGTGCAACGTCCTCCCCTTGAAGCTGCTCATGCCCCAGTGCAACCAAGTGCTTGACGACTACTTCCCCCTGGTCATCGACTACTTCCAGAACCAGACTGACTCAAACGGCATCTGTATGCACCTGGGCCTGTGCAAATCCCGGCAGCCAGAGCCAGAGCAGGAGCCAGGGATGTCAGACCCCCTGCCCAAACCTCTGCGGGACCCTCTGCCAGACCCTCTGCTGGACAAGCTCGTCCTCCCTGTGCTGCCCGGGGCCCTCCAGGCGAGGCCTGGGCCTCACACACAGGATCTCTCCGAGCAGCAATTCCCCATTCCTCTCCCCTATTGCTGGCTCTGCAGGGCTCTGATCAAGCGGATCCAAGCCATGATTCCCAAGGGTGCGCTAGCTGTGGCAGTGGCCCAGGTGTGCCGCGTGGTACCTCTGGTGGCGGGCGGCATCTGCCAGTGCCTGGCTGAGCGCTACTCCGTCATCCTGCTCGACACGCTGCTGGGCCGCATGCTGCCCCAGCTGGTCTGCCGCCTCGTCCTCCGGTGCTCCATGGATGACAGCGCTGGCCCAAGGTCGCCGACAGGAGAATGGCTGCCGCGAGACTCTGAGTGCCACCTCTGCATGTCCGTGACCACCCAGGCCGGGAACAGCAGCGAGCAGGCCATACCACAGGCAATGCTCCAGGCCTGTGTTGGCTCCTGGCTGGACAGGGAAAAGTGCAAGCAATTTGTGGAGCAGCACACGCCCCAGCTGCTGACCCTGGTGCCCAGGGGCTGGGATGCCCACACCACCTGCCAGGCCCTCGGGGTGTGTGGGACCATGTCCAGCCCTCTCCAGTGTATCCACAGCCCCGACCTTTGATGAGAACTCAGCTGTCCAGCTGCAAAGGAAAAGCCAAGTGAGACGGGCTCTGGGACCATGGTGACCAGGCTCTTCCCCTGCTCCCTGGCCCTCGCCAGCTGCCAGGCTGAAAAGAAGCCTCAGCTCCCACACCGCCCTCCTCACCGCCCTTCCTCGGCAGTCACTTCCACTGGTGGACCACGGGCCCCCAGCCCTGTGTCGGCCTTGTCTGTCTCAGCTCAACCACAGTCTGACACCAGAGCCCACTTCCATCCTCTCTGGTGTGA
GGCACAGCGAGGGCAGCATCTGGAGGAGCTCTGCAGCCTCCACACCTACCACGACCTCCCAGGGCTGGGCTCAGGAAAAACCAGCCACTGCTTTACAGGACAGGGGGTTGAAGCTGAGCCCCGCCTCACACCCACCCCCATGCACTCAAAGATTGGATTTTACAGCTACTTGCAATTCAAAATTCAGAAGAATAAAAAATGGGAACATACAGAACTCTAAAAGATAGACATCAGAAATTGTTAAGTTAAGCTTTTTCAAAAAATCAGCAATTCCCCAGCGTAGTCAAGGGTGGACACTGCACGCTCTGGCATGATGGGATGGCGACCGGGCAAGCTTTCTTCCTCGAGATGCTCTGCTGCTTGAGAGCTATTGCTTTGTTAAGATATAAAAAGGGGTTTCTTTTTGTCTTTCTGTAAGGTGGACTTCCAGCTTTTGATTGAAAGTCCTAGGGTGATTCTATTTCTGCTGTGATTTATCTGCTGAAAGCTCAGCTGGGGTTGTGCAAGCTAGGGACCCATTCCTGTGTAATACAATGTCTGCACCAATGCTAATAAAGTCCTATTCTCTTTTATGAGAAAGAAAAAGACACCGTCCTTTAAAGTGCTGCAGTATGGCCAGACGTGGTGGCTCACACCTGCAATCCCAGCACCTTAGGAGGCCGAGGCAGGAGGATCCTTGAGGTCAGGAGTTCGAGACCAGCCTCGCCAACATGGTGAAACCCCATTTCTACTAAAAATACAAAAAATTAGCCAAGTGTGGTGGCATATGCCTGTAATCCCAACTACTCAGAAGGCCGAGGCAGGAGAATTACTTGAACGCAGGAGAATCACTGCAGCCCAGGAGGCAGAGGTTGCAGTGAGCCGAGATTGCACCACTGCACTCCAGCCTGGGTGACAGAGCAAGACTCCATCTCAGTAAATAAATAAATAAATAAAAAGCGCTGCAGTAGCTGTGGCCTCACCCTGAAGTCAGCGGGCCCAGGCCTACCTCACTCTCTCCCTTGGCAGAGAAGCAGACGTCCATAGCTCCTCTCCCTCACAAGCGCTCCCAGCCTGCCCTCCAGCTGCTGCTCTCCCCTCCCAGTCTCTACTCACTGGGATGAGGTTAGGTCATGAGGACACCAAAAACCTAAAAATAAACAAAAAGCCAAACAAGCCTTAGCTTTTCTTAAAGACTGAAATGCCTGGAAGTGTCCCTTTATTTATAAAATAACTTTTGTCATATTTCTTATACATGTTTCTTGTAAGAAATTCAGAAACTACAGACAAAGAGAGTGGAAATTACCCACTGTCAGGCCTCTGAGCCCAAGCTAAGCCATCATATCCCCTGTGCCCTGCACGTATACACCCAGATGGCCTGAAGCAACTGAAGATCCACAAAAGAAGTGAAAATAGCCAGTTCCTGCCTTAACTGATGACATTCCACCATTGTGATTTGTTCCTGCCCCACCCTAACTGATCAATTGACCTTGTGACAATACACCTTCCCCACCCTTGAGAAGGTGCTTTGTAATATTCTCCCCACCCACCCCACGCCCGCACCCCCGCACCCTTAAGAAGGTATTTTGTAATATTCTCTCCGCCATTGAGAATGTGCTTTGTAAGATCCACCCCCTGCCCACAAAAAATTGCTCCTAACTCCACCGCCTATCCCAAACCTACAAGAACTAATGATAATCCCACCACCCTTTGCTGACTCTTTTTGGACTCAGCCCACCTGCACCCAGGTGATTAAAAAGCTTTATTGTTCACACAAAGCCTGTTTGGTAGTCTCTTCACAGGGAAGCATGTGACACCCACAATCCCACCTAGCCCAGGAGAGAGCTACGGCAGGGTGTGTGTTTTGACACTGAGCTTGGGGCTTTTTCCATCTTCTCCCCACAGCCTCTGGCTCCACACCTCCACCGTTCAAGCGCCAGAAAGAGCTGTCTATGCAGCCTGCTCTTGGGCCTGGGGATGAGACACACAATTCATTGGCTCCTGGATTTTAAGTAGACATT
TGTAAATCTATAGCTAACTACTGTCCTTAAAGCCATTGTTTCCATTACAAAATCCAACTCTCTGAGAGAAAAGGGTGTTTTAAATTTAAAAAAATAAAAACAAAAAAGTTTGATTGAGAAAAAAAAAAAAAAA XAGE-1d(SEQ ID NO: 36) >gi|18157207|emb|AJ318879.1|Homo sapiens mRNA for XAGE-1d proteinGGGAACGCGGCGGAGCTGTGAGCCGGCGACTCGGGTCCCTGAGGTCTGGATTCTTTCTCCGCTACTGAGACACGGCGGACACACACAAACACAGAACCACACAGCCAGTCCCAGGAGCCCAGTAATGGAGAGCCCCAAAAAGAAGAACCAGCAGCTGAAAGTCGGGATCCTACACCTGGGCAGCAGACAGAAGAAGATCAGGATACAGCTGAGATCCCAGGTGCTGGGAAGGGAAATGCGCGACATGGAAGGTGATCTGCAAGAGCTGCATCAGTCAAACACCGGGGATAAATCTGGATTTGGGTTCCGGCGTCAAGGTGAAGATAATACCTAAAGAGGAACACTGTAAAATGCCAGAAGCAGGTGAAGAGCAACCACAAGTTTAAATGAAGACAAGCTGAAACAACGCAAGCTGGTTTTATATTAGATATTTGACTTAAACTATCTCAATAAAGTTTTGCAGCTTTCACCAAAAAAAAAA

All literature and similar materials cited in this application, including but not limited to patents, patent applications, articles, books, treatises, and internet web pages, are expressly incorporated by reference in their entirety for any purpose. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of ordinary skill in the art to which the various embodiments described herein belong. When definitions of terms in incorporated references appear to differ from the definitions provided in the present teachings, the definition provided in the present teachings shall control.

Various modifications and variations of the described compositions, methods, and uses of the technology will be apparent to those skilled in the art without departing from the scope and spirit of the technology as described. Although the technology has been described in connection with specific exemplary embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention that are obvious to those skilled in pharmacology, biochemistry, medical science, or related fields are intended to be within the scope of the following claims.

1. A method of screening for a lung neoplasm in a sample obtained from a subject, the method comprising: a) assaying a sample from a subject for an amount of at least one RNA marker selected from the group consisting of GAGE12D, FAM83A, LRG1, XAGE-1 d, MAGEA4, SFTPB, AKAP4, and CYP24A1 in a sample obtained from a subject; b) assaying said sample for an amount of reference marker in said sample; c) comparing the amount of said at least one RNA marker to the amount of reference marker in said sample to determine a level of expression for said at least one marker gene in said sample; and d) generating a record reporting the expression for said at least one marker gene in said sample.
2. The method of claim 1, wherein said assaying comprises obtaining a sample comprising RNA from a subject and treating the RNA with a reverse transcriptase to form a cDNA copy of at least a portion of said RNA.
3. The method of claim 1, wherein said at least one RNA marker is at least two of said markers.
4. The method of claim 1, wherein said at least one RNA marker comprises the group consisting of GAGE, FAM83A, LRG1 and MAGEA4.
5. The method of claim 1, wherein said at least one RNA marker comprises the group consisting of GAGE, FAM83A, LRG1, CYP24A1, XAGE1D and MAGEA4.
6. The method of claim 1, wherein said reference marker is an RNA selected from the group consisting of CASC3 mRNA, β-actin mRNA, U1 snRNA and U6 snRNA.
7. The method of claim 1, wherein the assaying comprises using polymerase chain reaction, nucleic acid sequencing, mass spectrometry, mass-based separation, a flap endonuclease assay, and/or target capture.
8-9. (canceled)
10. The method of claim 1, wherein assaying the expression of the RNA marker comprises detecting an increased or decreased expression of the RNA marker relative to a normal expression of the marker.
11. The method of claim 1 wherein the sample is a tissue sample, a blood sample, a serum sample, or a sputum sample.
12-13. (canceled)
14. A kit, comprising: a) at least one oligonucleotide, wherein at least a portion of said oligonucleotide specifically hybridizes to a marker selected from the group consisting of GAGE12D, FAM83A, LRG1, XAGE-1 d, MAGEA4, SFTPB, AKAP4, and CYP24A1, and b) at least one additional oligonucleotide, wherein at least a portion of said additional oligonucleotide specifically hybridizes to a reference nucleic acid.
15. The kit of claim 14, wherein said kit comprises at least two additional oligonucleotides.
16. The kit of claim 14, wherein said kit further comprises one or more components selected from the group consisting of reverse transcriptase, flap endonuclease, DNA polymerase, and a FRET cassette.
17-18. (canceled)
19. The kit of claim 16, wherein said kit comprises at least 4 oligonucleotides, wherein each of the markers in the group consisting of GAGE, FAM83A, LRG1, and MAGEA4 specifically hybridizes to at least one of said 4 oligonucleotides.
20. (canceled)
21. The kit of claim 16, wherein said oligonucleotide is selected from one or more of a capture oligonucleotide, a pair of nucleic acid primers, a nucleic acid probe, and an invasive oligonucleotide.
22. The kit of claim 14, wherein said reference marker is an RNA selected from the group consisting of CASC3, β-actin, U1 RNA and U6 RNA.
23. A composition comprising a reaction mixture comprising a complex of at least one RNA marker selected from the group consisting of GAGE12D, FAM83A, LRG1, XAGE-1 d, MAGEA4, SFTPB, AKAP4, and CYP24A1 and an oligonucleotide that specifically hybridizes to said RNA marker.
24. The composition of claim 23, further comprising a complex of at least one reference marker and an oligonucleotide that specifically hybridizes to said reference marker.
25-26. (canceled)
27. The composition of claim 24, wherein said reference marker is an RNA selected from the group consisting of CASC3, β-actin, U1 RNA and U6 RNA.
28. The composition of claim 23, wherein said oligonucleotide is selected from one or more of a capture oligonucleotide, a pair of nucleic acid primers, a nucleic acid probe, and an invasive oligonucleotide.
29. The composition of claim 23, further comprising one or more components selected from the group consisting of reverse transcriptase, flap endonuclease, thermostable DNA polymerase, and a FRET cassette.
30. The composition of claim 23, wherein said composition comprises a nucleic acid probe oligonucleotide comprising a reporter molecule.
31-33. (canceled)
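
By way of illustration only, steps c) and d) of claim 1 might be sketched in Python as follows. The sketch assumes that the marker and reference amounts have already been measured by some assay and are supplied as non-negative numbers in the same units; it treats the level of expression as the simple ratio of marker amount to reference amount, which is an assumption made for the example and not a calculation taken from the specification. The names `ExpressionRecord`, `relative_expression`, and `build_records`, the marker spelling `XAGE-1d`, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

# Marker names as recited in claim 1 (spelling of XAGE-1d is an assumption).
MARKERS = (
    "GAGE12D", "FAM83A", "LRG1", "XAGE-1d",
    "MAGEA4", "SFTPB", "AKAP4", "CYP24A1",
)


@dataclass(frozen=True)
class ExpressionRecord:
    """Hypothetical record reporting expression for one marker in one sample."""
    sample_id: str
    marker: str
    marker_amount: float
    reference_amount: float
    relative_expression: float


def relative_expression(marker_amount: float, reference_amount: float) -> float:
    """Step c): compare marker amount to reference amount (assumed here to be a ratio)."""
    if marker_amount < 0:
        raise ValueError("marker amount must be non-negative")
    if reference_amount <= 0:
        raise ValueError("reference amount must be positive")
    return marker_amount / reference_amount


def build_records(
    sample_id: str,
    marker_amounts: Dict[str, float],
    reference_amount: float,
) -> List[ExpressionRecord]:
    """Step d): generate one record per assayed marker."""
    records = []
    for marker, amount in marker_amounts.items():
        if marker not in MARKERS:
            raise ValueError(f"unknown marker: {marker}")
        level = relative_expression(amount, reference_amount)
        records.append(
            ExpressionRecord(sample_id, marker, amount, reference_amount, level)
        )
    return records


if __name__ == "__main__":
    # Made-up amounts, for demonstration only.
    for record in build_records("sample-1", {"FAM83A": 50.0, "LRG1": 200.0}, 100.0):
        print(record)
```

A ratio is used here only because it is the simplest way to express a marker amount relative to a reference; any other normalization could be substituted in `relative_expression` without changing the record structure.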